
Author:
Colin Howson


Logic with trees

Logic with trees is a new and original introduction to modern formal logic. Unlike most texts on the subject, it includes discussions of more philosophical issues such as truth, conditionals and modal logic. Preferring explanation and argument to intimidatingly rigorous development, Colin Howson presents the formal material in a clear and informal style that both beginners and those with some knowledge of formal methods will appreciate. Examples and exercises guide readers through the book, and answers to selected exercises at the end allow them to monitor their own progress.

Logic with Trees gives students
• a complete and clear account of the truth tree system for first-order logic
• an understanding of the importance of logic and of its relevance to other disciplines
• the skills to grasp sophisticated formal reasoning techniques that are necessary to explore complex metalogic
• and the ability to contest claims that ‘ordinary’ reasoning is well represented by formal first-order logic

Howson’s carefully planned textbook covers both truth-functional and full first-order logic, using the truth tree or semantic tableau approach; he gives completeness and soundness proofs for both truth-functional and first-order trees, and makes extensive use of induction. In addition, he discusses alternative deductive systems, transfinite numbers and categoricity, the Löwenheim-Skolem theorems and the celebrated theorems of Gödel and Church. The book concludes with an account of Kripke’s attempt to solve the Liar Paradox and a discussion of the weaknesses of the truth-functional account of conditionals.

Logic with Trees will be particularly useful for those who feel wary of formal methods, since it shows how simple even quite sophisticated formal reasoning can be. It will be of interest to students of philosophy at undergraduate level and beyond, as well as to students of mathematics and computer science.
Colin Howson is Reader in Logic at the London School of Economics and Political Science.

The London School of Economics and Political Science/ Routledge Books published under the joint imprint of LSE/Routledge are works of high academic merit approved by the Publications Committee of the London School of Economics and Political Science. These publications are drawn from the wide range of academic studies in the social sciences for which the LSE has an international reputation.

Logic with trees An introduction to symbolic logic

Colin Howson

London and New York

First published 1997 by Routledge 11 New Fetter Lane, London EC4P 4EE
This edition published in the Taylor & Francis e-Library, 2005.
“To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.”
Simultaneously published in the USA and Canada by Routledge 29 West 35th Street, New York, NY 10001
© 1997 Colin Howson
All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.
British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library
Library of Congress Cataloguing in Publication Data Howson, Colin. Logic with trees: an introduction to symbolic logic/Colin Howson. p. cm. 1. Logic, Symbolic and mathematical. I. Title. BC135.H68 1996 96–7315 160–dc20
ISBN 0-203-97673-8 Master e-book ISBN

ISBN - (Adobe e-Reader Format) ISBN 0-415-13341-6 (hbk) 0-415-13342-4 (pbk)

To Minou, who would have enjoyed sitting on this book

Contents

Acknowledgments
Introduction

Part I Truth-functional logic
1 The basics
2 Truth trees
3 Propositional languages
4 Soundness and completeness

Part II First-order logic
5 Introduction
6 First-order languages: syntax and two more tree rules
7 First-order languages: semantics
8 Soundness and completeness
9 Identity
10 Alternative deductive systems for first-order logic
11 First-order theories
12 Beyond the fringe

List of notation
Answers to selected exercises
References
Name index
Subject index

Acknowledgments I should like to express my gratitude to Tony Dale, Gustavo Fernandez and Tony Ungar for their detailed comments on earlier versions of this book. Other people who have offered very helpful advice and discussion are Rose Gibson, R.I.G. Hughes, Peter Milne, Margaret Morrison, Jan von Plato, Demetris Portides, Adam Rieger, Aldo Visintin, John Worrall, Elie Zahar, and many undergraduates and postgraduates of the London School of Economics. I should like to express my thanks also to Theresa Hunt, Pat Gardner, Cynthia Ma and Towfic Shomar for their assistance in preparing the typescript.

Introduction Logic was one of the first scientific disciplines to be identified and studied systematically. For various reasons, which historians of ideas still disagree about, the Stoic and Aristotelian beginnings were left undeveloped, and no real advances were made until over two thousand years later. Then, in the second half of the nineteenth century, a succession of mathematicians took up the subject, and as a result of their attentions it grew rapidly into a discipline of great power; it has generated results which have transformed the way we think, at a quite fundamental level. In particular, it has given us information about the limitations of theorising that could hardly have been imagined, even if the questions could have been formulated, only a century ago. Some of these results have been interpreted in extraordinary ways. In Douglas Hofstadter’s best-seller Gödel, Escher, Bach (1979), two celebrated theorems of Gödel have been compared to the works of Bach. Elsewhere, they have convinced some people that we are more than machines, and others that we are no more than machines. On a more practical level, however, logic is now acknowledged to be of central importance, particularly in computer science and artificial intelligence, and anyone who wants to work in the area of software development will have to have an increasingly considerable degree of acquaintance with it. ‘Logic’, as the foreword to one of a rapidly increasing number of recent texts on logic oriented towards applications in computing attests, is ‘the calculus of computer science’ 1 . Indeed, because of its central role there, logic is now playing a similar enabling role in the information technology revolution to that which mathematics played in the scientific revolution of the seventeenth and eighteenth centuries. Logic texts exemplify a variety of proof systems. 
The one used in this book is the increasingly popular, arguably the most user-friendly, and the most obviously machine-implementable system, based on the semantic tableau method pioneered by the Dutch logician Evert Beth. It has been developed and simplified since, receiving its classic exposition in Raymond Smullyan (1968). Recently Richard Jeffrey has used a simplified form of it in his marvellous Formal Logic: Its Scope and Limits (1994). The present book is much more elementary than Smullyan’s, while I have attempted to introduce more standard material about first-order logic than Jeffrey does (including, in Chapter 3, an exposure to the crucially important role in many metatheorems played by induction), and less of the theory of computation. I have also added some ‘philosophical’ discussions, of truth and the Liar Paradox, categoricity, second-order logic, modal logic and conditionals.

This book can be used in various ways. Chapters 1, 2, 4 (the unstarred sections), 5 and 6 are material for an introductory course, while 1–9 would make up a comprehensive one-year course in first-order logic, and are suitable for students at both undergraduate and postgraduate level with no mathematical background who want to be able to understand the mesh of syntactic and semantic arguments that makes up modern formal logic. Chapter 11 attempts to give some idea of the depth and significance of the classic results of modern (meta)logic, while Chapter 12 outlines some of the ways in which first-order logic has been extended, and some of the principal objections brought against the representation of conditionals in first-order logic. These chapters could be used as a supplementary text in philosophy of logic, with the earlier material used as a source of reference for the main technical results of first-order logic. Proofs of soundness and completeness theorems for truth-functional and full first-order truth trees are given in Chapters 4 and 8. In Chapter 10 there is a discussion of examples of two of the main alternative proof systems, Hilbert-style and Natural Deduction. Various sections of the first eight chapters are starred, to indicate that the material is not so elementary there, and starred exercises indicate a greater level of difficulty.

I have tried to fulfil three principal aims: (i) to give a complete and clear account of the truth tree system for first-order logic, and of the important metatheorems associated with it; (ii) to show why logic is an exciting and flourishing discipline; and (iii) to show that the sorts of formal techniques exploited in proving even some ‘deep’ metalogical results are within the grasp of even determinedly non-mathematical students; for example, the various soundness and completeness proofs are not intrinsically difficult, and are certainly within the capacity of the non-specialist in logic to work through and understand. There are frequent failures in the book to achieve, and sometimes to approach, the highest standards of rigour, which I hope can be pardoned as sacrifices to clarity. There is no shortage of very rigorous texts for those who want them. One particular lapse is a more or less systematic failure to make explicit the ‘use-mention’ distinction.
Labouring that distinction with typographical devices of one sort or another both disfigures a text and makes it difficult to read. Usually context suffices to distinguish use from mention, and there are warnings where I believe that there is any danger of conflation. Those who believe that departure from the use-mention orthodoxy approaches mortal sin may be induced to take a more lenient view by noting the inclusion of a discussion of and running references to the object-language-metalanguage distinction, and a separate discussion of the Liar Paradox.

Note

1 Foreword to Garton 1990.

Part I Truth-functional logic

Chapter 1 The basics 1 DEDUCTIVELY VALID INFERENCE There is much more to logic than the question of what makes inferences deductively valid or invalid, but to most people that is what logic is all about, so that is where we shall begin. One of the most basic features of these inferences is that they seem to be composed of declarative sentences, that is to say sentences which make assertions capable of being true or false. ‘Boris Yeltsin became President of Russia in 1991’, ‘All hydrogen atoms have one proton in their nucleus’ and ‘Michelangelo painted the ceiling of the Sistine Chapel’ are declarative sentences, and (we believe) true ones at that. ‘Shut that door!’ and ‘Is Hanoi in Scotland?’ are not. Neither is Chomsky’s funny example ‘Colourless green ideas sleep furiously’, which has the grammatical form, but only the form, of a fact-stating sentence. So far, so good. The sentences composing an inference are its premises and conclusion, the latter usually signalled by the prefix ‘therefore’ (for which, for brevity, we shall often use the symbol ∴). If the inference is deductively valid the conclusion is called a deductive or logical consequence, or simply consequence, of the premises. What else do we know? Well, one of the most familiar facts about deductively valid inferences, and the one which probably goes farthest towards explaining the importance they have always been accorded, is that it is impossible for their conclusions to be false if their premises are true: if anything is basic to the notion of deduction, that surely is. Consider this example, known as a disjunctive syllogism: It’s raining or it’s snowing. It’s not raining. ∴ It’s snowing. Clearly, you don’t have to know whether it’s actually raining or not, or snowing or not, to know that if the premises are true, so too is the conclusion. Not only is the conclusion true if the premises are. The conclusion must be true if the premises are true; there is no possibility of its being false.
Not only is this the most important property of deductively valid inferences; it is difficult to think of any other that has that same generality. This being so, we might as well take it as the defining property, and accordingly frame the following Provisional definition: a valid deductive inference is one whose premises cannot all be true and conclusion false. The definition is provisional because the word ‘cannot’ itself rather obviously needs a


definition, and providing an adequate one is not trivial: most of this book will be occupied in the task. But one thing we do know is that ‘cannot’ here has nothing at all to do with empirical fact, as it does in the statement that water cannot unaided run uphill. ‘Cannot’ in this context refers to a logical impossibility. It is a logical, not merely a physical, impossibility that ‘It’s snowing’ is false if both ‘It’s either raining or it’s snowing’ and ‘It’s not raining’ are true (assuming sameness of spatio-temporal reference in premises and conclusion). Here are two more examples to consider.

If Lev is in Moscow then Irina is in Kiev. Lev is in Moscow. ∴ Irina is in Kiev.
Cain was hairy and Abel was his victim. ∴ Cain was hairy.

It is intuitively clear that these remain deductively valid, in the sense of the definition above, whatever sentences are substituted for ‘Cain was hairy’, ‘Abel was his victim’, ‘Lev is in Moscow’, ‘Irina is in Kiev’ and, in the disjunctive syllogism, ‘It’s raining’ and ‘it’s snowing’. Another way of putting it is to say that if we replace these sentences by the letters A, B, C, D, E and F, the respective formal representations (or formalisations) of these inferences

E or F; not E; ∴ F
If A then B; A; ∴ B
C and D; ∴ C

will always generate deductively valid inferences when the letters A, B, C, D, E and F are replaced by any sentences. An explanation of why this is so will plausibly rest on an analysis of the logical role played by the particles ‘and’, ‘or’, ‘not’, and ‘if… then—’. Now, a common method of analysing some phenomenon is to construct a model of it and see whether the model behaves in a way sufficiently resembling what it is supposed to model. This will be our procedure. The model, which will be presented in a systematic form in Chapter 3, is called a propositional language. ‘And’, ‘or’, ‘not’, etc. are basic syntactical items of these languages, and in the following sections we shall describe the way they are used to form compound truth-functional sentences, and the rules which determine how truth and falsity should be ascribed to these. (The syntax of a language is the set of rules which determine its formal structure, that is to say the way its basic vocabulary is organised into well-formed expressions, among which are the sentences of the language; the rules by which the sentences are equipped with truth-conditions constitute the language’s semantics.)
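The provisional definition of validity can be illustrated with a small Python sketch. It is only a rough stand-in, under an assumption the text has not yet justified: ‘cannot’ is read as ‘under no assignment of true and false to the letters’, and ‘or’, ‘and’, ‘not’ and ‘if… then—’ are given the truth-functional readings developed in the following sections. The function name `is_valid` and the use of lambdas for sentences are choices made for this illustration, not part of the book's model.

```python
from itertools import product

def is_valid(premises, conclusion, n_letters):
    """True if no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=n_letters):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # a counterexample assignment exists
    return True

# Disjunctive syllogism: E or F; not E; therefore F
print(is_valid([lambda e, f: e or f, lambda e, f: not e], lambda e, f: f, 2))

# If A then B; A; therefore B ('if A then B' is read here as 'not A or B',
# the reading the book adopts later, in section 7)
print(is_valid([lambda a, b: (not a) or b, lambda a, b: a], lambda a, b: b, 2))

# C and D; therefore C
print(is_valid([lambda c, d: c and d], lambda c, d: c, 2))

# For contrast, an invalid form: E or F; therefore F
print(is_valid([lambda e, f: e or f], lambda e, f: f, 2))
```

The first three calls report that no counterexample exists, while the last one finds an assignment (E true, F false) that makes the premise true and the conclusion false.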


2 SYNTAX: CONNECTIVES AND THE PRINCIPLE OF COMPOSITION ‘And’, ‘or’, ‘if…then—’ are structural items, called connectives by logicians, which articulate sentences into further sentences. ‘Cain was hairy’ and ‘Abel was his victim’ are said to be conjoined by ‘and’ to yield the conjunction ‘Cain was hairy and Abel was his victim’; the two sentences forming the conjunction are its conjuncts. ‘Not’ operates on the sentence ‘It’s raining’ to generate its negation ‘It’s not raining.’ ‘It’s raining’ and ‘it’s snowing’ are disjoined by ‘or’ to form the disjunction ‘It’s raining or it’s snowing’ of those two sentences, which are called the disjuncts. The sentences ‘Lev is in Moscow’ and ‘Irina is in Kiev’ are combined into the conditional sentence ‘If Lev is in Moscow then Irina is in Kiev’; ‘Lev is in Moscow’ is the antecedent, and ‘Irina is in Kiev’ is the consequent. These connectives play such a fundamental role that they have been given special symbols by logicians. The following are now standard:

Connective    Symbol
not           ¬
and           ∧
or            ∨
if…then       →

Because they operate on pairs of sentences to generate other sentences, ∧, ∨ and → are binary connectives; ¬ is unary, because it operates on single sentences. In what follows we shall refer, as just now, to ∧, ∨, and → directly as connectives rather than as connective symbols. In our model the basic items out of which its sentences are built are these connectives and a stock of capital letters A, B, C, D, etc., called sentence letters, from the beginning of the Roman alphabet. These are intended to represent some given set of English sentences with whose internal structure we are not concerned. The sentence letters are often called the atomic sentences of the model, because all its other sentences are compounded from them, using the connectives. The first level of composition consists of the negations, disjunctions, conjunctions of sentence letters, and conditionals formed from them. Each of these compounds is represented as follows: the negation of A by ¬A (¬ is prefixed to A, in contrast to the way ‘not’ is ordinarily embedded within a sentence to form its negation); the conjunction of A and B by A∧B; the disjunction of A and B by A∨B; and the conditional with antecedent A and consequent B by A→B. In a natural language there is no theoretical limit, though there are obviously practical ones, to the extent that sentences can be successively compounded together by means of connectives. Such a principle of composition will also operate in our model, to allow the sentences to be compounded ad infinitum using ¬, ∧, ∨ and →, generating symbol-strings like A→¬B, ¬(A→B), ((A→B)∨(B∧A))→C, etc. The brackets indicate which component


sentences in each compound the various connectives connect. Denoting arbitrary sentences in the model by the letters X, Y, Z, etc., we can give a compact statement of the principle of composition in which the bracketing is automatically taken care of. The statement has two clauses, one unconditional, the other conditional: A, B, C, etc. are sentences, and if X and Y are sentences, then so also are ¬X, ¬Y, (X∨Y), (X∧Y), (X→Y) (in informal discussion the outer brackets will generally be dropped). In Chapter 3 we shall see that these two clauses determine for each sentence in the model a unique ancestral tree. 3 SEMANTICS: TRUTH-FUNCTIONALITY There are two other important elements in our model, the truth-values ‘true’ and ‘false’, which will be represented by the letters T and F. We shall make the important assumption that the truth-values of compound sentences depend on the truth-values of their component sentences; in particular, it will be assumed that the truth-values of ¬X, X∧Y, X∨Y and X→Y depend on those of X and Y and only on those. Call this assumption that of the truth-functionality of the connectives. Mostly, this assumption works quite well, though there are apparent exceptions which we shall investigate at length later in the book. It has the following consequence, on which the whole of truth-functional logic is based. Consider X∧Y. Its truth-value depends on those of X and Y; the truth-value of each of these, if it is not a sentence letter, depends on those of the sentences out of which it is immediately compounded; and so on backwards in the same way until we arrive at the sentence letters which are not compounded out of anything. In other words, the truth-value of any compound X in the model depends only on the truth-values of the sentence letters appearing in it. This consequence will be called the truth table principle, for the following reason. Let X be any compound. 
Suppose we arrange all the finitely many sentence letters, say A, B, C, etc., which appear in X, in a row, and write underneath all the possible distributions of truth-values to these in rows underneath A, B, C, etc. We then write X to the right of all the A, B, C, etc, giving a diagram that looks like this:

A  B  C  …  |  X
T  T  T  …  |
T  T  F  …  |
…
F  F  F  …  |

The truth-functionality assumption implies, as we have just seen, that each row of truth-values on the left of the diagram determines a unique truth-value for X, which we shall write beneath X opposite the relevant row of truth-values on the left. The resulting table is called the truth table for X. The assignment of truth-values to sentence letters themselves is what model-builders call exogenous, determined outside the model. Our concern here is only with how those


truth-values, whatever they might be, determine the column of Ts and Fs beneath X in its truth table. The truth table principle tells us that this problem is solved once we have determined the truth tables for ¬ A, A∧B, A∨B, A→B. In the next section we shall make a start by determining the truth tables for ¬A and A∧B. We end this section on a philosophical note. There is a long-standing debate about whether sentences are truly bearers of truth-values, or whether only propositions can be (the usual definition of a proposition is that it is what is expressed by a sentence). While there is considerable disagreement, however, about exactly what sorts of things propositions actually are, there is absolutely no doubt that in everyday life the bearers of truth-values are the sorts of structured linguistic items described above, variously called statements or sentences; logicians tend to call them sentences. At any rate, it is with modelling these things that logicians have concerned themselves. And so shall we. 4 NEGATION AND CONJUNCTION If someone says to you that it is not the case that so and so, and you take what they say to be true, then this means that you take ‘so and so’ to be false. And if you take what they say to be false, this is because you take ‘so and so’ to be true. Putting these observations together in the model, we represent ‘so and so’ by the sentence letter A and construct the following truth table for ¬A:

A  |  ¬A
T  |  F
F  |  T

In words: ¬A is true when A is false, and false when A is true. Similarly, the truth table for A∧B, i.e. ‘A and B’, is obtained by the same method of identifying the conditions under which we believe a conjunction to be true and those under which we believe it to be false. If you agree that A∧B is true, you are agreeing that A and B are both true, while if you think that A∧B is false, it is because you think that at least one of A and B is false. This immediately gives the truth table for ∧:

A  B  |  A∧B
T  T  |  T
T  F  |  F
F  T  |  F
F  F  |  F

In words: A∧B is true just in case A and B are both true; otherwise it is false.
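The principle of composition and the truth table principle together suggest a simple recursive procedure, sketched below in Python. The encoding is an assumption made for this illustration only: a sentence letter is a string such as "A", a negation is the tuple ("not", X) and a conjunction is the tuple ("and", X, Y). The function names are likewise hypothetical. Only ¬ and ∧ are handled, since those are the two connectives whose truth tables have been fixed so far.

```python
from itertools import product

def letters(s):
    """Collect the sentence letters occurring in sentence s, in sorted order."""
    if isinstance(s, str):
        return [s]
    found = set()
    for part in s[1:]:
        found.update(letters(part))
    return sorted(found)

def value(s, assignment):
    """Truth-value of s under an assignment of True/False to its letters."""
    if isinstance(s, str):
        return assignment[s]          # exogenous: supplied from outside
    if s[0] == "not":
        return not value(s[1], assignment)
    if s[0] == "and":
        return value(s[1], assignment) and value(s[2], assignment)
    raise ValueError(f"unknown connective: {s[0]}")

def truth_table(s):
    """One (row of letter values, value of s) pair per distribution of truth-values."""
    names = letters(s)
    rows = []
    for vals in product([True, False], repeat=len(names)):
        rows.append((vals, value(s, dict(zip(names, vals)))))
    return rows

def show(s):
    """Print the table using the letters T and F."""
    for vals, result in truth_table(s):
        print(" ".join("T" if v else "F" for v in vals), "|", "T" if result else "F")

show(("not", "A"))
show(("and", "A", ("not", "B")))
```

Because `value` descends through the compound until it reaches sentence letters, the value of any compound depends only on the values assigned to its letters, which is the content of the truth table principle.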


Important note The truth-functionality assumption says that the truth-value of a conjunction or negation depends only on the truth-values of the sentences conjoined or negated, not on those sentences themselves. This means that, though the truth tables above are written for sentence letters A and B, they are equally valid when A and B are replaced by X and Y, i.e. by arbitrary sentences of the model language, compound or atomic. We could make this explicit by writing, as some do, the truth tables for negation and conjunction like this:

X  |  ¬X
T  |  F
F  |  T

X  Y  |  X∧Y
T  T  |  T
T  F  |  F
F  T  |  F
F  F  |  F

However, we shall continue to use the earlier tables, because their format is straightforwardly extended to the evaluation of any compound sentence, however complex. Exercises 1 If ¬A is true, what is the truth-value of A? 2 If A∧B is false and B is true, what is the truth-value of A? If you know merely that A∧B is false, does that tell you anything about the truth-value of A?

5 DISJUNCTION It is often claimed that there are two types of disjunction, or use of the word ‘or’, in English and other natural languages, inclusive and exclusive. To assert an exclusive disjunction (i.e. to claim implicitly that it is true) is to assert that one or other disjunct is true, but not both, while to assert an inclusive disjunction is to assert that at least one is true, and possibly both. There is certainly an inclusive use of ‘or’ in English; examples abound (here is one: ‘If you’re old or disabled nobody bothers with you’; we would all take the ‘old or disabled’ here to include any who are both old and disabled). By contrast, it is actually quite difficult to find a genuine use of exclusive ‘or’ which is not exclusive simply because the disjuncts are themselves exclusive, for example ‘He got either ten or twenty years; I can’t remember which.’ At any rate, logicians regard the inclusive ‘or’ as primary, and ∨ is accordingly given the truth table

A  B  |  A∨B
T  T  |  T
T  F  |  T
F  T  |  T
F  F  |  F

In words:


A∨B is false only when A and B are both false, and otherwise true. Nothing is lost in apparently ignoring the exclusive disjunction, because as we shall see in the next section, it is already implicit in the connectives ¬, ∧ and ∨. Warning Words are notoriously not always what they seem. Consider the statement ‘You may have tea or you may have coffee’, which is not, as it appears to be, a disjunction but a conjunction: it actually says that you may have tea and you may have coffee (though it does not mean that you may have both). Exercises If A∨B is true and A is false, what is the truth-value of B?

6 TRUTH-FUNCTIONAL EQUIVALENCE We can use the truth table principle to evaluate arbitrary truth-functional compounds built up from sentence letters using connectives from the list ¬, ∧, ∨ and →. Consider, for example, the compound (A∨B)∧¬(A∧B). We first evaluate the disjunction A∨B and the inner conjunction A∧B against each row of the truth table, then the negation ¬(A∧B), and finally the conjunction (A∨B)∧¬(A∧B), as below. The truth-values of this final conjunction are listed in the last column of the truth table:

A  B  |  A∨B  |  A∧B  |  ¬(A∧B)  |  (A∨B)∧¬(A∧B)
T  T  |  T    |  T    |  F       |  F
T  F  |  T    |  F    |  T       |  T
F  T  |  T    |  F    |  T       |  T
F  F  |  F    |  F    |  T       |  F

Not a very exciting compound, one might think. However, suppose we introduce a new binary connective xor (exclusive ‘or’; i.e. exclusive disjunction), whose truth table is

A  B  |  A xor B
T  T  |  F
T  F  |  T
F  T  |  T
F  F  |  F

Inspection of the truth table for (A∨B)∧¬(A∧B) now reveals that it depends on the truth-values of A and B in exactly the same way as does the truth-value of A xor B: for each row of the truth table the two compounds take the same truth-values. This is interesting for two reasons. First, it verifies the claim that exclusive ‘or’ is implicit in the connectives ¬, ∧ and ∨. So we do not need a special symbol like xor for exclusive disjunction: we could simply define the exclusive disjunction of A and B to be the


compound (A∨B)∧¬(A∧B). Second, we have a new concept: truth-functional equivalence. A pair X, Y of compounds are said to be truth-functionally equivalent if, like A xor B and (A∨B)∧¬(A∧B), X and Y take the same value at each row of the truth table generated by listing all distributions of truth-values over the set of all the sentence letters that appear in each compound. This set of sentence letters may be the same for both compounds, as it is above, but it may not. For example, A and A∧(B∨¬B) do not have the same set of sentence letters, but for all rows of the truth table generated by the four distributions of T and F over A and B, A and A∧(B∨¬B) take the same truth-values, and are therefore truth-functionally equivalent. We shall use the notation X ⇔ Y to signify that X and Y are truth-functionally equivalent. (Note that ⇔ is not itself a connective.) As we shall see in the following chapters, the notions of truth-functional equivalence and its extension, first-order equivalence, will turn out to be of fundamental importance. As an exercise in the truth table evaluation of compound sentences, we shall end this section by showing that (A→C)∧(B→C) and (A∨B)→C are truth-functionally equivalent:

A  B  C  |  A→C  |  B→C  |  (A→C)∧(B→C)  |  A∨B  |  (A∨B)→C
T  T  T  |  T    |  T    |  T            |  T    |  T
T  T  F  |  F    |  F    |  F            |  T    |  F
T  F  T  |  T    |  T    |  T            |  T    |  T
F  T  T  |  T    |  T    |  T            |  T    |  T
T  F  F  |  F    |  T    |  F            |  T    |  F
F  T  F  |  T    |  F    |  F            |  T    |  F
F  F  T  |  T    |  T    |  T            |  F    |  T
F  F  F  |  T    |  T    |  T            |  F    |  T

Note There is no logical significance to the order in which the eight truth-value distributions over A, B and C are listed, though it is a good practical rule, as above, to start with all Ts, then all the ways (three) two Ts can be combined with one F, then all the ways (three) one T can be combined with two Fs, and then finish with all Fs. If a compound is built up from n distinct sentence letters, its truth table will have 2ⁿ rows, since there are two ways of assigning T or F to the first letter, and for each of these there will be two ways of assigning T or F to the second, and for each of these there will be two ways of assigning T or F to the third, and so on, giving 2·2·2·…, n times, which is equal to 2ⁿ.
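The row count and the equivalences discussed in this section can be checked mechanically. The following Python sketch is one possible way of doing so, not a method taken from the book: sentences are written as ordinary Python functions of their letters, `rows`, `equivalent` and `implies` are names introduced here, and `implies` assumes the truth table for → that the book gives in section 7. The rows come out in the order produced by `itertools.product`, which differs from the practical ordering recommended above; as the note says, the order has no logical significance.

```python
from itertools import product

def rows(n):
    """All distributions of truth-values over n sentence letters."""
    return list(product([True, False], repeat=n))

def equivalent(f, g, n):
    """True if f and g take the same value on every one of the rows."""
    return all(f(*r) == g(*r) for r in rows(n))

def implies(x, y):
    # Assumed table for the arrow: false only when x is true and y is false.
    return (not x) or y

# n letters give 2**n rows
for n in (1, 2, 3, 4):
    assert len(rows(n)) == 2 ** n

# A xor B against (A or B) and not (A and B)
print(equivalent(lambda a, b: a != b,
                 lambda a, b: (a or b) and not (a and b), 2))

# A against A and (B or not B), evaluated over the letters A and B
print(equivalent(lambda a, b: a,
                 lambda a, b: a and (b or not b), 2))

# (A -> C) and (B -> C) against (A or B) -> C
print(equivalent(lambda a, b, c: implies(a, c) and implies(b, c),
                 lambda a, b, c: implies(a or b, c), 3))
```

Each of the three checks reports that the pair agrees on every row.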


Exercises
1 Construct truth tables for the following compounds:
(i) (B∨C)∧(C∨B)
(ii) ¬(A∧¬C)∨B
(iii) ¬A∧(¬C∨B)
2 Construct truth tables to show that
(i) A∧(B∧C)⇔(A∧B)∧C and A∨(B∨C)⇔(A∨B)∨C. This property of ∧ and ∨ is called associativity.
(ii) A∧(B∨C)⇔(A∧B)∨(A∧C) and A∨(B∧C)⇔(A∨B)∧(A∨C). These are the so-called distributivity laws.

7 THE CONDITIONAL It is time to look at the final connective in the list of connectives drawn up in section 2, the conditional or arrow →, intended to symbolise the English ‘if…then—’. Imagine that you are listening to an old-fashioned melodrama, and at one point one of the protagonists exclaims ‘You will not reveal all, or I am lost!’ The substance of this assertion could equally well be conveyed, albeit more prosaically, by the conditional ‘If you reveal all then I am lost.’ This suggests that whatever English sentences A and B might represent, ¬A∨B and A→B should be merely different formulations of the same information, and hence be truth-functionally equivalent. Supposing this to be the case, the truth table for A→B is

A  B  |  A→B
T  T  |  T
T  F  |  F
F  T  |  T
F  F  |  T

because that, as the reader should check, is the truth table for ¬A∨B. In words: A→B is false when A is true and B false, and true in all other cases. (Readers who have some familiarity with logic programming will probably be more accustomed to seeing A→B written B←A, i.e. ‘B if A’). However, there seem to be other types of conditional statement in everyday life that are not expressed by →. One in particular seems to demand a definitely non-truth-functional analysis, and this is where the antecedent is counterfactual. For example, consider the sentence ‘If I had struck the match at that particular moment [t, say], a genie would have appeared’, where you did not in fact strike the match at that moment. Most people would regard this sentence as false, but if it is expressed as ‘I strike the match at moment t → a


genie appears’, and evaluated by means of the truth table for →, then, given that the antecedent is false (counterfactual), the truth table for → makes the sentence true! This goes strongly against intuition, and to make matters worse, ‘If I had struck the match at that particular moment a genie would not have appeared’ also comes out true on the truth-functional reading using →, because the antecedent remains false in this sentence too. We shall postpone further discussion of counterfactuals to Chapter 12, where we shall also look at some challenges to the truth-functional reading of some non-counterfactual conditionals. Exercises 1 Construct truth tables to show that the following truth-functional equivalences hold:

2 Let å(A, B) be false only when A is false and B true. How would one express å(A, B) using only the connective →? 3 Verify that A→B has the same truth table as ¬A∨B and ¬(A∧¬B).
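The point made in section 7 about counterfactual antecedents can be seen in a few lines of Python. This is a minimal sketch that assumes only the truth table for → stated there; the function name `arrow` and the variable names are invented for the illustration.

```python
def arrow(a, b):
    """Truth-functional conditional: false only when a is true and b is false."""
    return (not a) or b

struck = False  # the match was not in fact struck at moment t

# Whatever truth-value 'a genie appears' has, both conditionals come out true
# on the truth-functional reading, because the antecedent is false.
for genie in (True, False):
    print(arrow(struck, genie), arrow(struck, not genie))
```

Both conditionals are evaluated as true on every pass through the loop, which is exactly the counterintuitive result the text describes.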

8 SOME OTHER CONNECTIVES, AND THE BICONDITIONAL Other connectives in common use in English, like ‘unless’, ‘but’ and ‘only if’, for example, can be more or less faithfully defined in terms of ¬, ∧, ∨ and →, and we shall deal with them in turn: But ‘I went to see the film but I didn’t like it’ says, from the point of view of simple truth and falsity and shorn of the nuance of regret, ‘I went to see the film and I didn’t like it.’ Logic is concerned with the way the truth-values of sentences depend on each other, and not with one’s feelings about what the sentences describe; so from the purely logical point of view, ‘but’ is ‘and’. Unless ‘You will not reach 100 unless you first reach 99 (years of age)’ plausibly means the same as ‘If you do not first reach 99 you will not reach 100’; so we shall take ‘A unless B’ to mean the same as ‘If not B then A’, represented in the model by ¬B→A. Only if ‘You will reach the age of 100 only if you first reach the age of 99’ means the same as ‘If you don’t first reach the age of 99 you won’t reach the age of 100’; so we take ‘A only if B’ to mean the same as ¬B→¬A. Now look at the truth table for ¬B→¬A:
A   B   ¬B   ¬A   ¬B→¬A
T   T   F    F      T
T   F   T    F      F
F   T   F    T      T
F   F   T    T      T


But this is the truth table for A→B, which means that we can render ‘A only if B’ directly by A→B. If A→B is true then A is often called a sufficient condition for B, and B a necessary condition for A. There is one further connective which is often distinguished by being given a special symbol, even though it too will turn out to be definable in terms of other connectives among ∧, ∨, ¬ and →. This is the so-called biconditional ‘if and only if’, and it will be symbolised by ↔. It has its own symbol because statements of the form ‘A if and only if B’ crop up very frequently. However, the biconditional could also be defined in terms of ∧ and →. ‘A if B’ is clearly B→A, and we have just seen that ‘A only if B’ has the same truth table as A→B. Hence, A↔B, ‘A if and only if B’, is truth-functionally equivalent to (A→B)∧(B→A), from which it follows that its truth table is evaluated as
A   B   A→B   B→A   (A→B)∧(B→A)
T   T    T     T         T
T   F    F     T         F
F   T    T     F         F
F   F    T     T         T

i.e.: A↔B is true just in case A and B have the same truth-value. Exercises 1 You are in a country whose inhabitants randomly tell the truth or lie. You are trying to reach the capital city, which you know lies on the road you are following, but to your dismay the road forks. The capital is on one of the forks, but there is no signpost. A native of the country appears. A law of the country is that the natives are allowed only to answer ‘yes’ or ‘no’ to questions. How they will do so will depend on whether they are telling the truth or lying, but which they will do on any given occasion of course you simply don’t know (but you do know that they are very good at logic). All seems hopeless until you remember that, long ago, you attended a logic course. Suddenly you realise that there is a slightly complicated question you can ask this person, such that their answer will tell you for certain which fork the capital lies on. What is the question? (There is more than one, but the following method will certainly generate one. Consider a question of the form ‘Is X(A, B) true?’, where X(A, B) is a truth-functional compound of A and B which you are familiar with, A is the sentence ‘You are lying’, and B is ‘The capital lies on the left fork.’ You want the native’s answer to be ‘yes’ if and only if B is true, and working back from this will identify X(A, B); or, similarly, you may want to correlate the native’s ‘yes’ answer with B’s falsity; either


way, once the native has answered, you’ll know for sure whether the capital lies on the left fork.) 2 Which connective among ∧, ∨, ¬, → and ↔ would you use to represent ‘just in case’ in the statement ‘A∧B is true just in case A and B are both true’? 3 Display the truth-functional structure of the assertions in (i) and (ii) below using the sentence letters indicated and the appropriate connectives among ∧, ∨, ¬ and → (omit the ‘Therefore’ in each case). (i) If wage-settlements continue at this high level (A) or they increase (B), and nothing is done to take money out of the economy (C), then inflation will continue to rise (D) and we shall be in serious trouble (E). Therefore if nothing is done to take money out of the economy we shall be in serious trouble. (ii) Tracey won’t return Also sprach Zarathustra (A) unless Wayne gives her the £5 he owes her (B), but Wayne will not do this without Rudolf giving him some of the money (C). Amaryllis doesn’t want Rudolf to do this (D), and if Amaryllis doesn’t want Rudolf to then Rudolf won’t. Carlos will only be able to do his homework (E) if Tracey returns Also sprach Zarathustra. Therefore Carlos won’t be able to do his homework. 4 Which connective among ∧, ∨, ¬ and → would you use to represent ‘whilst’ in the sentence ‘Whilst I believe in law and order, the actions of the police sometimes make me unhappy’? 5 Give one way of expressing the truth-functional form of ‘Untidy or inaccurate work will cost you marks.’ (Be careful!) 6 Is the sentence ‘Jill and Siân are the opposing team’ a conjunction of two sentences? 7 ‘If you are over 18 and married, then your name will go on the register unless you have already received benefit.’ How would you formalise this statement using the propositional connectives already given?
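Two equivalences claimed in this section, that ¬B→¬A has the same truth table as A→B and that A↔B has the same table as (A→B)∧(B→A), can be checked by running through all four distributions. The sketch below is a hedged illustration; `implies`, `iff` and `equivalent` are hypothetical helper names chosen here, and the connectives are modelled as plain Python functions on booleans.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """A→B: false only when a is true and b is false."""
    return (not a) or b

def iff(a: bool, b: bool) -> bool:
    """A↔B: true just in case a and b have the same truth-value."""
    return a == b

def equivalent(f, g) -> bool:
    """True when f and g agree on every distribution over (A, B)."""
    return all(f(a, b) == g(a, b)
               for a, b in product([True, False], repeat=2))

# 'A only if B', read as ¬B→¬A, against A→B.
only_if_matches = equivalent(lambda a, b: implies(not b, not a), implies)

# 'A if and only if B', read as (A→B)∧(B→A), against ↔.
biconditional_matches = equivalent(
    iff, lambda a, b: implies(a, b) and implies(b, a))

print(only_if_matches, biconditional_matches)
```

The same `equivalent` helper can be pointed at any other pair of two-letter compounds to compare their tables.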

Chapter 2 Truth trees 1 TRUTH-FUNCTIONALLY VALID INFERENCE We now have a slightly better vantage point from which to investigate the inferences with which we started in Chapter 1: If Lev is in Moscow then Irina is in Kiev. Lev is in Moscow. ∴ Irina is in Kiev. Cain was hairy and Abel was his victim. ∴ Cain was hairy. It’s raining or it’s snowing. It’s not raining. ∴ It’s snowing. These have the respective truth-functional representations (A and B will obviously represent different sentences in each):
(i) A→B, A ∴ B
(ii) A∧B ∴ A
(iii) A∨B, ¬A ∴ B

In each of (i)–(iii), consider what happens if we try to assume that the premises could be true and the conclusion false: (i) If B were to be false and A true, then the truth table tells us that A→B would be false. Thus we could not, on pain of contradiction, have true premises and a false conclusion. Therefore the inference is truth-functionally valid. It is usually referred to by its classical Latin name: modus ponens. (ii) is immediate. If A∧B is true then, from the truth table for ∧, both A and B are true. Hence in particular A must be true. (iii) is the disjunctive syllogism. Suppose that B is false, and that ¬A and A∨B are both true. Then we have a contradiction, for A must be false, and we have assumed that B is false, so that A∨B must be false, contrary to assumption. (i), (ii) and (iii) are therefore deductively valid inferences. They are deductively valid, moreover, by virtue of their truth-functional structure alone, and inferences which are valid by virtue of their truth-functional structure alone are called truth-functionally valid. More precisely, A truth-functionally valid inference is one whose premises and conclusion can


be represented as truth-functional compounds built up from some set of sentence letters, such that there is no distribution of truth-values over those sentence letters which makes the premises all true and the conclusion false. Just as the conclusion of a deductively valid inference is said to be a deductive consequence of the premises, so the conclusion of a truth-functionally valid inference is said to be a truth-functional consequence of the premises. If there is a distribution of truth-values making all the premises true and the conclusion false, then that distribution is called a truth-functional counterexample to the inference. Hence: An inference is truth-functionally valid just in case there is no truth-functional counterexample to it. A very important consequence of the definition of truth-functionally valid inference is that there is an algorithm for deciding whether any given inference is truth-functionally valid or not (an algorithm is a ‘mechanical’ procedure which decides all members of a given class of problems in finite time, and is now usually regarded as a program which can be run on a suitably powerful computer). If there are n sentence letters in the sentences making up the inference, we know that there are 2ⁿ truth-value distributions over those sentence letters, and a truth table will evaluate all the sentences for each of these distributions. Then all we have to do is see whether one or more of those 2ⁿ distributions make all the premises true and the conclusion false. However, for even quite moderate values of n, 2ⁿ is a biggish number (2²⁰ is about a million). That search procedure is therefore exponentially complex, and it turns out that it contains in addition a lot of redundancy.
For we learn nothing about the validity of the inference from examining the truth-value distributions which make either the premises false or the conclusion true: the only relevant distributions when considering deductive validity are clearly just those which make the premises true and the conclusion false. This is where the tree diagrams for the connectives come in so useful, for as we shall see in the next sections, they can be used to systematically eliminate the uninformative paths in the search for counterexamples. Exercises 1 Show that if X and Y are any two truth-functional compounds, then X⇔Y if and only if X and Y have exactly the same truth-functional consequences. 2 Show that X⇔Y if and only if X is a truth-functional consequence of Y and Y is a truth-functional consequence of X.
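The truth-table decision procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: sentences are modelled as Python functions from a distribution (a dictionary of truth-values) to a truth-value, and the names `counterexamples` and `is_valid` are hypothetical. The fourth inference tested, A→B, B ∴ A, is not from the text; it is included only to show a counterexample being found.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """A→B: false only when a is true and b is false."""
    return (not a) or b

def counterexamples(letters, premises, conclusion):
    """Yield each distribution making every premise true and the conclusion false."""
    # There are 2**len(letters) distributions to examine.
    for values in product([True, False], repeat=len(letters)):
        v = dict(zip(letters, values))
        if all(p(v) for p in premises) and not conclusion(v):
            yield v

def is_valid(letters, premises, conclusion) -> bool:
    """Valid just in case there is no truth-functional counterexample."""
    return next(counterexamples(letters, premises, conclusion), None) is None

# (i) modus ponens, (ii) A∧B ∴ A, (iii) disjunctive syllogism.
modus_ponens = is_valid(
    "AB", [lambda v: implies(v["A"], v["B"]), lambda v: v["A"]],
    lambda v: v["B"])
simplification = is_valid(
    "AB", [lambda v: v["A"] and v["B"]], lambda v: v["A"])
disjunctive_syllogism = is_valid(
    "AB", [lambda v: v["A"] or v["B"], lambda v: not v["A"]],
    lambda v: v["B"])

# Illustrative invalid inference (an assumption, not in the text): A→B, B ∴ A.
found = list(counterexamples(
    "AB", [lambda v: implies(v["A"], v["B"]), lambda v: v["B"]],
    lambda v: v["A"]))

print(modus_ponens, simplification, disjunctive_syllogism, found)
print(2 ** 20)  # number of distributions for twenty sentence letters
```

Every distribution is visited, including those that make a premise false, which is the redundancy the tree method is designed to avoid.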

2 CONJUGATE TREE DIAGRAMS There is another way of displaying the information given in the truth tables for ∧, ∨, → and ↔, which as we shall see shortly generates a powerful and elegant method of proving truth-functional validity. This is to represent the truth tables for binary connectives by


tree diagrams, or more exactly by conjugate pairs of tree diagrams. Since the truth-value of, say, a conjunction depends only on the truth-values of its conjuncts, and not on the conjuncts themselves, we can write the truth table for the conjunction X∧Y of two compounds X and Y in the same way as if they were sentence letters:
X   Y   X∧Y
T   T    T
T   F    F
F   T    F
F   F    F

Now consider the following pair of diagrams:

For the time being, regard the boldface T’s and F’s as integral parts of each diagram (T and F are printed in bold type to distinguish them from the letters X and Y). Read upwards, the left-hand diagram can be interpreted as saying that when X and Y are both true (T), so is X∧Y, and the right-hand diagram can be interpreted as saying that if X is false (F), whatever the truth-value of Y, then so is X∧Y, and that if Y is false, whatever the truth-value of X, then so is X∧Y. So interpreted (and we shall see later that the diagrams can be read both upwards and downwards), the left-hand diagram represents the one row of the truth table for X∧Y in which X∧Y takes the value T, and the right-hand diagram represents the three rows for which X∧Y takes the value F. In other words, the pair of diagrams contains exactly the information contained in the truth table for X∧Y. The diagrams are called signed tree diagrams because they are tagged (or signed) with truth-values. We can eliminate these tags in the following way. Noting that when any sentence Z is false its negation ¬Z is true, we can rewrite the second diagram as

Now both the left- and the right-hand diagrams are signed only with Ts, and that being the case we can regard the Ts as understood and omit explicit mention of them. So we are left with a pair of unsigned diagrams, the unsigned conjugate tree diagrams for ∧:


We could equally accurately have represented the F rows of the truth table for X∧Y by an unsigned diagram with three branches:

where each branch represents one row of the truth table for which X∧Y is F. However, the function of conjugate diagrams is not simply, or even primarily, to represent truth tables; their principal function is to provide rules of inference, for what will be called tree proofs, and for them to perform this role efficiently the number of branchings from any sentence is best limited to at most two. We shall see very shortly how these diagrams can then be put together to form elegant and virtually mechanical proofs. The left-hand diagram in a conjugate pair will always represent the T rows of the truth table for the relevant connective, and the right-hand diagram the F rows. With this in mind we turn to the truth table for ∨:
X   Y   X∨Y
T   T    T
T   F    T
F   T    T
F   F    F

Reading from the table, we see that if X is true then X∨Y is true, whatever the truth-value of Y, and if Y is true then X∨Y is true, whatever the truth-value of X. This gives us the left-hand (signed) diagram for ∨:

There is just one row where X∨Y takes the value F, and that is the row at which both X and Y take the value F. This gives us the right-hand signed diagram for ∨:


Using the same unsigning procedure as for ∧, we obtain the pair of unsigned conjugate diagrams for ∨:

Notice the mirror symmetry between the pair of diagrams for ∧ and that for ∨: if we change ∨ to ∧ in the diagrams for ∨, and negate each sentence, eliminating all double ¬’s, we obtain the diagrams for ∧; and vice versa. This is a special case of a principle called the Duality Principle, which we shall return to in Chapter 3. The same procedure as that used to obtain the diagrams for ∧ and ∨ yields the conjugate unsigned diagrams for → and ↔:

Where the lower sentences in these diagrams occur in pairs it is customary to write one member of the pair above the other, as above. No priority is implied in this listing, and it can be reversed without changing the diagram characteristics, as we shall see. The reader may be wondering what has happened to ¬. It too is a truth-functional connective, defined by a truth table, and consequently it too will have a pair of unsigned diagrams representing its truth table. Since its properties have already been implicitly used in converting signed diagrams into unsigned ones, it might seem that the unsigned diagrams for negation itself are unlikely to be very informative. To some extent this is true, but by no means entirely. The signed diagrams are these:


yielding the unsigned diagrams:

Clearly, the left-hand diagram of this pair is totally uninformative. This leaves us with only one non-trivial diagram, the right-hand one. Though a single diagram, it will turn out to be very useful, so useful that it is given a special name, the Rule of Double Negation. Exercises 1 Construct the truth tables for the sentence Z represented in (a) and (b) below by the left-hand unsigned member of each of a pair of unsigned conjugate diagrams, and express Z as a truth-functional compound of X and Y, employing any of the connectives ∧, ∨, ¬ and → (remember that the left-hand diagram always represents the T rows of the table for Z).

(a)

(b) 2 Construct the truth table for the compound which these conjugate diagrams determine, and rewrite the right-hand diagram so that it satisfies the condition that no diagram has more than two branches:

3 TRUTH TREES The conjugate diagrams of the previous section (together with the Rule of Double Negation) can be used to yield simple graphic demonstrations of truth-functional validity. We shall start with some very simple examples, the inferences (i)–(iii) of section 1. First (i), modus ponens:


Asking whether it is possible for the premises A→B, A to take the truth-value T and the conclusion, B, the value F is, of course, equivalent to asking whether A→B, A, and ¬B can all take the value T. Write down A→B, A, ¬B, signifying that we are supposing, for the sake of argument, that they’re all true:

These will be called the initial sentences. The order in which they are listed is immaterial: they’re simply a set of three sentences assumed, for the moment, to be all true. Now write underneath the initial sentences the lower part of the tree diagram for A→B:

We now have a small upside-down tree:

If we define a branch in the tree to be a continuous path up from the terminal lower sentences to the topmost sentence of the tree, then the tree above has two branches, one carrying the sentences A→B, A, ¬B, ¬A, and the other A→B, A, ¬B, B. This tree is called a truth tree. The information it contains is probably more immediately conveyed by the equivalent signed version:

On the left-hand branch we see AT and AF, and on the right-hand branch BT and BF; neither is a consistent assignment of truth-values. As those two branches represent all the possibilities admitted by the initial assignment, this signed tree shows that the assumption


that A→B and A are true and B false is impossible: it leads to a contradictory assignment of T and F either to B or to A. The original tree also showed this, if less explicitly, by having both A and ¬A on one branch, and B and ¬B on the other. At any rate, we infer that there can be no truth-functional counterexample to the inference; i.e. no distribution of truth-values to A and B which makes the premises A→B and A both true and the conclusion B false. In other words, the inference A, A→B ∴B is truth-functionally valid. Of course, we already knew that, by means of an argument in section 1 that in some ways resembles an informal version of the one above. But the tree format has the advantage over the informal argument that it extends to a simple and mechanical method for deciding the validity or otherwise of any truth-functional inference whatever, no matter how complex. Now for (ii):

Here the sole premise is A∧B, and the conclusion is A, and we shall again use a tree— they will all be unsigned from now on—to show that the assignment of T to A∧B and F to A, i.e. of T to both A∧B and ¬A, is impossible. This time we get a tree with only one branch

on which the pair A, ¬A appear. These cannot both be true, so again we conclude that our assumption, that a truth-value distribution over A, B exists which satisfies both A∧B and ¬A, leads to a contradiction. Hence the inference from A∧B to A is truth-functionally valid. Finally, (iii). Again, we write down the premises and negation of the conclusion, implicitly assuming them all true:

We see immediately that we have A and ¬A on one branch, and B and ¬B on the other. Since these branches exhaust the possible ways in which the initial sentences can be true together, we infer that those initial sentences cannot all be true together, and so (iii) is


truth-functionally valid. This all seems very promising. Let us see how it fares with a slightly more complex inference: If the mark rises or the yen rises the dollar will fall. The dollar will not fall. ∴ It is not the case that the mark rises and the yen rises. This has the truth-functional form
(A∨B)→C, ¬C ∴ ¬(A∧B)

As in the previous examples, we list the premises and the negated conclusion, so:

and, starting from this set of initial sentences, we shall use the conjugate diagrams for the connectives to decompose the compound sentences into simpler ones until the tree generated in the process finally tells us, by depositing a sentence and its negation on every branch, that there is no way all the initial sentences can be true. The final sentence on the list of initial sentences is the double negation ¬¬(A∧B), which seems a good enough place to start. Applying the rule of double negation to ¬¬(A∧B), we extend a branch downwards as follows, numbering the lines as we go:

In the terminology we shall use from now on, ¬¬(A∧B) has been used. For book-keeping purposes we can indicate this by placing a tick beside it—we don’t want inadvertently to use it twice. Proceeding downwards, we can now write the diagram for A∧B directly beneath A∧B:


Now what? We now look around for another compound sentence to use by writing the tree diagram for it under the lowest point of the tree. There is only one such sentence, the topmost, the conditional (A∨B)→C. This has antecedent A∨B and consequent C. Directly beneath B on line 6 write the diagram for the conditional with that antecedent and consequent:

We have now used (A∨B)→C, and in so doing extended the tree to one with two branches. The right-hand branch, terminating in C, contains the negation ¬C of C as well as C, and therefore (visualise the tree signed with truth-values) represents an impossible truth-value assignment. Accordingly, we shall close that branch by writing a line beneath it to signify that it must not be continued. As soon as any two sentences occur on a branch, one of which is the negation of the other, that branch is closed. No further attention is paid to closed branches; they merely represent failed attempts to make the initial sentences all true. The remaining unclosed branches on any tree are said to be open. There is now one open branch on the tree above, the left-hand one. It also contains a compound sentence for which there is a tree diagram, namely ¬(A∨B). Accordingly we continue the left-hand branch by writing the tree diagram for ¬(A∨B) beneath ¬(A∨B):


Now all the branches have closed. When this happens the tree itself is said to close, a phenomenon we have now learned to interpret as meaning that no distribution of truth-values at all over A, B and C will make the initial sentences jointly true. Therefore the inference we started with is truth-functionally valid: no distribution of truth-values over its sentence letters makes the premises true and the conclusion false. It makes little difference in what order we use the sentences on a tree. In particular, it makes no difference to whether the tree closes or not (we shall give a rigorous proof of this in Chapter 4). We can check that this is so in the example above by constructing a tree in which the sentence (A∨B)→C is used first:

Important note In the event of there being more than one open branch at the point at which a sentence is used, place the tree diagram for that sentence at the end of every such branch. Bearing in mind that the tree diagrams give the truth-conditions for their respective connectives, it’s not difficult to see why this should be done: each open branch of the tree represents a different way in which the sentences on it which have already


been used can be jointly true. Each time a further sentence is used, that represents a further constraint, and one which has to be applied uniformly to all open continuations of that branch. Hence when a sentence is used in a tree, its diagram must be placed on every open branch. We have observed that we can interpret a closed tree as indicating that there is no distribution of truth-values over its sentence letters which will make the initial sentences all true. What if we eventually use all the usable sentences on the tree and it doesn’t close? The tree in these circumstances is said to be finished and open. Can we infer that there is a truth-value distribution over the sentence letters that will make all the initial sentences true? The answer is ‘yes’. Informally, the argument is as follows: each open branch of the tree represents a way in which the sentences on it which have been used can be true. If there is no further sentence to be used and there are still open branches, this means that there are ways in which all the sentences on those branches, including the initial sentences, can all be true. To sum up: Truth trees constructed from a set of initial sentences tell us whether there is a truth-value distribution over their sentence letters which will make the initial sentences true. If there is no such distribution, the tree will close. If there is one, it won’t. So far we have admittedly given only rather intuitive arguments for these claims, especially the last, but eventually we shall be in a position to give a more rigorous one. We shall end this section with a résumé of how to build trees. First, write the initial sentences (when testing inferences, these are the premises and the negation of the conclusion) in any order. Then select what looks like the simplest compound sentence and write its diagram under the initial sentences. That sentence is now used.
Now choose another compound and write its diagram under the last sentence on each open branch if there is more than one. Continue doing this, closing every branch on which appears a sentence and its negation, until the tree itself either closes or else terminates without closing. Procedural remark Neither numbering the lines of a tree nor ticking branches is an essential part of tree construction; for extended trees these are useful devices, for simple ones usually unnecessary. Historical note Despite the fact that tree proofs of validity are a relatively modern development, they are really just a way of representing what logicians have traditionally called reductio ad absurdum arguments. In these you assume as true the premises and also the negation of the conclusion which, it is alleged, follows from them, and you try to deduce a contradiction (‘reduce them to absurdity’). If you succeed, that shows that the negation of the conclusion is inconsistent with the premises. The tree construction is a powerful modern systematisation of this ancient method of proof. Exercises 1 Formalise the following inferences using sentence letters and the appropriate


connectives, and construct truth trees to show that each is truth-functionally valid. (i) The butler did it or the gardener did it. The gardener did not do it. ∴ The butler did it. (ii) The butler did it or the gardener did it. ∴ If the butler did not do it then the gardener did. (iii) If the butler did not do it then the gardener did. ∴ The butler did it or the gardener did it. (iv) If the butler did it then the gardener did not. ∴ If the gardener did it then the butler did not. (v) ‘If naive realism is true then naive realism is false [not true]. ∴ Naive realism is false’ (Bertrand Russell). (vi) If the government raises interest rates then there will be inflation. If the government does not raise interest rates then there will be inflation. ∴ There will be inflation. 2 Suppose you have constructed a tree to test the validity of an inference, and all the branches close with one of the initial sentences still unused. What does this tell you (i) if the unused sentence is a premise, and (ii) if the unused sentence is the negated conclusion?
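The tree-building procedure summarised in this section lends itself to a short program. The Python sketch below is one possible rendering, offered under stated assumptions rather than as the book's own algorithm: sentence letters are strings, compounds are tuples tagged `"not"`, `"and"`, `"or"`, `"imp"` or `"iff"`, and the function names `neg`, `diagram`, `open_branches` and `is_valid` are hypothetical. The function `diagram` returns the lower part of the unsigned conjugate diagram for a sentence, and `open_branches` uses sentences one at a time, closing any branch that carries a sentence together with its negation.

```python
def neg(x):
    """Negation of a sentence. Letters are strings; compounds are tuples."""
    return ("not", x)

def diagram(s):
    """Lower part of the unsigned diagram for s, as a list of branches.

    Returns None for a sentence letter or a negated letter, to which
    no diagram applies."""
    if isinstance(s, str):
        return None
    op = s[0]
    if op == "and":
        return [[s[1], s[2]]]
    if op == "or":
        return [[s[1]], [s[2]]]
    if op == "imp":
        return [[neg(s[1])], [s[2]]]
    if op == "iff":
        return [[s[1], s[2]], [neg(s[1]), neg(s[2])]]
    if op != "not":
        raise ValueError(f"unknown connective in {s!r}")
    t = s[1]                       # s is the negation of t
    if isinstance(t, str):
        return None
    if t[0] == "not":              # Rule of Double Negation
        return [[t[1]]]
    if t[0] == "and":
        return [[neg(t[1])], [neg(t[2])]]
    if t[0] == "or":
        return [[neg(t[1]), neg(t[2])]]
    if t[0] == "imp":
        return [[t[1], neg(t[2])]]
    if t[0] == "iff":
        return [[t[1], neg(t[2])], [neg(t[1]), t[2]]]
    raise ValueError(f"unknown connective in {s!r}")

def open_branches(branch, unused=None):
    """Return the finished open branches of a tree grown from `branch`."""
    if unused is None:
        unused = list(branch)
    # A branch closes as soon as it carries a sentence and its negation.
    if any(neg(s) in branch for s in branch):
        return []
    for i, s in enumerate(unused):
        lower = diagram(s)
        if lower is not None:
            rest = unused[:i] + unused[i + 1:]   # s has now been used
            result = []
            for new in lower:
                result += open_branches(list(branch) + new, rest + new)
            return result
    return [list(branch)]          # nothing left to use: finished and open

def is_valid(premises, conclusion):
    """Valid just in case the tree from premises and negated conclusion closes."""
    return open_branches(list(premises) + [neg(conclusion)]) == []

# The mark/yen example: (A∨B)→C, ¬C ∴ ¬(A∧B).
currency = is_valid([("imp", ("or", "A", "B"), "C"), neg("C")],
                    neg(("and", "A", "B")))
print(currency)
```

Because the recursion follows each branch separately, using a sentence on one branch automatically places its diagram on every open continuation of that branch, as the Important note requires. The order in which sentences are used here is simply left to right, which is one arbitrary choice among many.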

4 TAUTOLOGIES AND CONTRADICTIONS Consider the compound A∨¬A. This takes the truth-value T whatever the truth-value of A might be. So does ¬(A∧¬A), while A→(B→A) and (A∧¬A)→B take the value T whatever the truth-values of each of A and B might be. Compounds like these which take the value T for all distributions of truth-values over their sentence letters are called tautologies. Compounds which take the value F for all values of their sentence letters are called contradictions. We can immediately infer that the negation of a tautology is a contradiction, and the negation of a contradiction is a tautology. We can use truth trees to test for tautologousness and for contradictoriness. No truth-value distribution over sentence letters can make a contradiction true, which means that if we construct a truth tree from it then eventually the tree will close. Conversely, if the tree closes then there is no truth-value distribution over sentence letters that makes the initial sentence true, and it is therefore a contradiction. Example


i.e. ¬(A∧B)→¬A is a contradiction. We can now also construct a tree test for tautologousness. For if a sentence is a tautology then its negation is a contradiction, which means that if we construct a tree from its negation the tree will close. In other words, X is a tautology if and only if a tree generated from ¬X closes. Beginners are always tempted to say that a compound is a tautology just in case a tree generated from it has only open branches. This is definitely incorrect, though it is left as an exercise to say why. Example (A→(B→C))→((A→B)→(A→C)). This is a tautology, as the following closed tree establishes (supply the justification for each line):

Finally, we can test for whether a compound is neither a tautology nor a contradiction. Construct a tree from ¬X, and another from X. Neither tree closes if and only if X is neither a tautology nor a contradiction. A tautology of the form X↔Y is called a tautological biconditional. There is an intimate relationship between tautological biconditionality and truth-functional equivalence: where X and Y are any truth-functional compounds, X⇔Y if and only if the compound (X↔Y) is a tautology (this is easy to show, and is left as exercise 1 below). This means that we can use a tree test for truth-functional equivalence, for we know how to use trees to test for tautologousness. Thus we can decide whether X⇔Y by seeing whether a tree generated from ¬(X↔Y) closes; if it does, X⇔Y. We can even shorten the procedure as follows. The tree diagram for a negated biconditional is this:


A tree generated from ¬(X↔Y) will therefore close if and only if the two trees generated by the pairs {X, ¬Y} and {Y, ¬X} of initial sentences both close. Hence X⇔Y if and only if trees generated from both pairs {X, ¬Y} and {Y, ¬X} of initial sentences close. Note that this result also tells us that X⇔Y if and only if X is a truth-functional consequence of Y and Y is a truth-functional consequence of X. Exercises 1 Let X and Y be truth-functional compounds. Explain why X⇔Y if and only if X↔Y is a tautology. 2 Show that any sentence is a truth-functional consequence of a contradiction, and that a tautology is a truth-functional consequence of any sentence. 3 Of the following, state which are tautologies, contradictions or neither, and construct trees to justify your statements. (i) A→(B→A) (ii) A→(¬A→B) (iii) A∧¬(A∨B) (iv) (A∨B)→B (v) (A∧B)→A (vi) A→¬A (vii) ¬(A→A) (viii) ¬(A→A) (ix) A∧¬(B→A) 4 Suppose a tree generated by a single sentence X has only open branches. Does this mean that X is a tautology? 5 Of the following pairs of sentences, state which are truth-functionally equivalent to each other, and construct trees to justify your statements. (a)

A→(B→C)

(A∧B)→C

(b)

A→B

¬A→¬B

(c)

(C∨A)∧(B∨A)

(C∧B)∨A

(d)

A∧C

¬(A→¬C)

(e)

(A→B)∧(C→B)

(A∨C)→B

(f)

(A→B)∧(A→C)

A→(B∨C)

(g)

(A∧¬A)∨(B∨C)

B∨C

(h)

A∨(B→D)

¬B∨(A∨D)

(i)

¬(A↔B)

(A∧¬B)∨(B∧¬A)

(j)

A∧¬A

B∧¬B

6 In what follows let T be an arbitrary tautology and ⊥ an arbitrary contradiction, as above. Let X be any truth-functional compound. With reference to appropriate trees,


explain why


Chapter 3 Propositional languages 1 PROPOSITIONAL LANGUAGES In Chapter 1 we introduced the notion of a propositional language, as a formal model of the class of truth-functional compounds which can be generated from some initially given set of sentences, using some specified set of connectives. These formal languages are of interest for two reasons: (i) they form the framework for the detailed development of truth trees in Chapter 4; and (ii) a characteristic feature of their syntactic structure allows a powerful method, called inductive proof, to be used to investigate their properties. Nearly all the results of this chapter will be directly or indirectly obtained by this method. However, if the reader wants to continue the development of truth trees where the last chapter left off, they can skip this chapter and proceed directly to Chapter 4, where enough about propositional languages to make the discussion intelligible will be explained at the outset. These preliminaries over, let S be a set of sentence letters and Π some set of connectives. Let L[S; Π] denote the set of all the truth-functional compounds which can be constructed using sentence letters from S and connectives from Π. L[S; Π] is called the propositional language generated by the sentence letters in S and the connectives in Π. We shall often refer to an arbitrary propositional language simply as L. Following the notational convention introduced in Chapter 1 we shall use capitals …, X, Y, Z from the end of the Roman alphabet to refer to arbitrary sentences of L. When the members of either S or Π are explicitly displayed, as they are in L[{A, B};{¬, ∧}], for example, we shall omit the set brackets {, } and simply write L[A, B; ∧, ¬]. S can be either finite or infinite, but Π will be assumed to be finite; it does not have to be the set {∧, ∨, ¬, →, ↔}, or even include any members of it.
If we want to define a new five-place connective © where ©(A, B, C, D, E) is true, say, when A and D are true, and false in all other 2⁵ − 2³ = 24 cases, then there is a perfectly respectable propositional language whose set of connectives contains ©. For all choices of Π except the empty set the number of distinct sentences in L[S; Π] is infinite, even when S contains just one letter. For example, suppose S is just {A} and Π is just {¬}. Then all of ¬A, ¬¬A, ¬¬¬A, ¬¬¬¬A, etc. are in L. The reader should keep in mind that though A and ¬¬A are truth-functionally equivalent, they are nevertheless different sentences: A has no occurrences of ¬ while ¬¬A has two.

Exercises
1 How many sentences are there in (i) L[A, B; Ø] (Ø is the standard symbol for the empty set; in other words, there are no connectives in this language), (ii) L[A; ¬], and (iii) L[A; →]?


2 Is A↔B a sentence in L[A, B; ∧, ∨, ¬, →]? If not, does this mean that there is no biconditional sentence in L? 3 Show that there are infinitely many sentences in L[A, B; ∨, ¬] truth-functionally equivalent to A→B.
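The row count for the five-place connective © described in section 1 can be checked by enumerating its truth table. The sketch below is only an illustration: the function name `circ` and the use of Python booleans for T and F are assumptions made here, not notation from the text.

```python
from itertools import product

def circ(a, b, c, d, e):
    """Hypothetical encoding of ©(A, B, C, D, E): true exactly when A and D are true."""
    return a and d

# All 2**5 distributions of truth-values over A, B, C, D, E.
rows = list(product([True, False], repeat=5))
true_rows = [r for r in rows if circ(*r)]
false_rows = [r for r in rows if not circ(*r)]

# Number of rows in the table, rows where © is true, rows where © is false.
print(len(rows), len(true_rows), len(false_rows))
```

Fixing A and D as true leaves B, C and E free, which is where the 2³ true rows come from; the remaining rows are the false ones.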

2 OBJECT-LANGUAGE AND METALANGUAGE

Throughout this book we shall be using more or less standard English to discuss the structure of the formally generated entities intended to be the symbolic representations of English (or any natural-language) sentences themselves. The language in which these structures and their relationships are discussed (in this case English) is called the metalanguage. The formal languages, like the propositional languages we have just introduced, whose structure is under discussion in the metalanguage, are usually called object-languages. Where, as here, a special symbolism is used for the object-language, the meta-/object-level distinction is fairly explicit. However, sometimes special symbols are used also in the metalanguage. For example, ⇔ is used in this book as a metalinguistic symbol, denoting a relation between object-language sentences. The letters …, X, Y, Z used to refer to ‘arbitrary’ sentences in a propositional language L are also metalinguistic objects; their correct classification is metalinguistic variables, ranging over a domain consisting of all the sentences of L. The letter L itself is also a metalinguistic variable, ranging over a domain consisting of propositional languages.

In saying that English is used as a metalanguage for the discussion of a propositional object-language, we are making a meta-meta-level assertion, since we are now discussing the relationship of ordinary English itself to its object-language(s). Yet we are doing so in ordinary English! In other words, one and the same language, ordinary English in this case, is made to operate on more than one level. But this is true to a great extent even as regards the metalanguage of the object-languages we shall discuss in this book and those object-languages themselves.
The reader will soon note, if they haven’t already, that in our metalinguistic discussion we are using terms like ‘and’, ‘or’, ‘not’, ‘if…then…’ and ‘if and only if’, which are, in symbolic form, part of the propositional languages described. There is nothing wrong with this; after all, children are taught the structure of their own language using that language. The results we shall establish about the properties of propositional and later first-order languages, and the system of formal deduction each will be associated with, are sometimes called metatheorems, because they are established using (usually rather informal) reasoning within the metalanguage. This metalinguistic reasoning itself is often called metalogical, or metatheoretic. Most of what appears in textbooks of logic is in fact metalogic, since its object is to discuss properties of formal systems. 3 ANCESTRAL TREES Consider the propositional language L[A, B, C,…; ¬, ∧, ∨, →]. We observed in Chapter 1


that its class of sentences can be specified in a very compact way. Call any finite string of symbols drawn from the list enclosed in square brackets in L[…] an expression of L. Then an expression of L is a sentence of L if and only if it is (i) a sentence letter of L, or (ii) of the form ¬Y, (Y∨Z), (Y∧Z) or (Y→Z), where Y and Z are sentences of L. These clauses define a class of objects (sentences of L) by stating that certain things (sentence letters) are in that class unconditionally, and that certain other things are in it conditionally on their being built in a specific way out of things already registered as members. A definition which specifies a class in terms of this sort of absolute-plus-conditional membership criterion is called an inductive definition. The ability of a class of entities to be defined inductively is a highly prized characteristic, because classes so defined obey a related principle of induction which can be used to elicit a range of other interesting properties shared by their members; more on this shortly.

Y is called the immediate predecessor of the sentence ¬Y, and Y and Z the immediate predecessors of each of the sentences Y∧Z, Y∨Z, Y→Z. Sentence letters are decreed to have no immediate predecessors. We can depict the immediate predecessor relation graphically as follows:

where Y in the first diagram and Y and Z in the second are the immediate predecessors of X. Do not confuse these diagrams with the conjugate tree diagrams for the connectives: the latter specify truth-conditions, whereas those above convey purely structural information. If either Y or Z in the diagrams above is not a sentence letter we can continue the diagrams downwards, to include their immediate predecessors, and so on until we eventually reach sentence letters which, because they have no immediate predecessors, are terminal points of the resulting tree. By analogy with human genealogy we shall call this tree the ancestral tree of X. Here is the ancestral tree of the sentence B→(B∨(A∧¬C)):

The nodes on this tree (i.e. the junction-points of line-segments, including the initial and


terminal points) below X are all predecessors of X; another term for these is proper subsentences of X. Thus the ancestral tree is also the subsentence tree of X. Exercises Draw the ancestral trees of the following sentences: (a) A∨¬(A→(B∨C)) (b) ¬(C∨D)∧(D∧¬C) (c) A→(B→¬(C∧¬D))
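The immediate-predecessor relation lends itself to a short recursive sketch. The representation below is an assumption made for illustration only: a sentence letter is a Python string, ¬Y is the tuple `('¬', Y)`, and a binary compound is `(connective, Y, Z)`. The function names are likewise hypothetical.

```python
def immediate_predecessors(x):
    """Immediate predecessors of sentence x; a sentence letter has none."""
    return [] if isinstance(x, str) else list(x[1:])

def show(x):
    """Write a sentence out with the bracketing of the inductive definition."""
    if isinstance(x, str):
        return x
    if x[0] == '¬':
        return '¬' + show(x[1])
    return '(' + show(x[1]) + x[0] + show(x[2]) + ')'

def ancestral_tree(x, depth=0):
    """Nodes of x's ancestral tree, top node first, as (depth, sentence) pairs."""
    nodes = [(depth, show(x))]
    for y in immediate_predecessors(x):
        nodes += ancestral_tree(y, depth + 1)
    return nodes

def terminal_nodes(x):
    """Sentence letters at the terminal points of the tree, left to right."""
    if isinstance(x, str):
        return [x]
    return [leaf for y in immediate_predecessors(x) for leaf in terminal_nodes(y)]

# The sentence B→(B∨(A∧¬C)) from the text.
X = ('→', 'B', ('∨', 'B', ('∧', 'A', ('¬', 'C'))))

for depth, sentence in ancestral_tree(X):
    print('  ' * depth + sentence)   # indentation stands in for the line-segments
print(terminal_nodes(X))
```

Every node printed below the first line is a proper subsentence of X, and the recursion stops exactly at the sentence letters, which is why the tree is always finite.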

4 AN INDUCTION PRINCIPLE

The sentences in any propositional language all have ancestral trees (the ancestral tree of a sentence letter A is just the single node consisting of A), a fact which is exploited in a very useful method of proving general results about these languages called the Principle of Induction on Immediate Predecessors. This says the following. Suppose P is any property which it makes sense to speak of sentences of L having or not having. Then: If (1) all the sentence letters of L have P, and (2) from the assumption that the immediate predecessors of any non-atomic sentence X in L have P, it follows that so too does X, then (3) every sentence in L has P.

To see why the principle is true, suppose L is any propositional language including the sentence letters A, B, C and the connectives ∧, ∨, ¬, →. Let X be B→(B∨(A∧¬C)) for example. Look at its ancestral tree given in section 3 above. Suppose that assumptions (1) and (2) above are satisfied. (1) tells us that A, B and C have P, and (2) tells us that we can think of a single line-segment or pair of line-segments in the ancestral tree as carrying possession of P upwards from those lower nodes to their successor node. But since, by (1), each of the terminal nodes B, B, A and C (from left to right in the tree) has P, it follows that possession of P is carried from successive level to successive level up the tree, until it is finally inherited by the topmost node X itself. Clearly, the same argument will establish that any given sentence of L has P.

The Induction Principle is really no more than a roundabout way of saying that every sentence in a propositional language has an ancestral tree. But it is very useful, and we shall show by means of some examples how the principle can be used as a powerful (meta)proof-technique. We shall start with a very simple example. Consider the language L[A; ∧]. We shall show, by induction, that no sentence in L is a tautology.
Let the property P in the statement of the Induction Principle be is not a tautology. First, we have to show that (1) in the statement of the principle is satisfied. This means that we have to show that every sentence letter of L is not a tautology. There is only one sentence letter in L, A, and A is not a tautology because it has the truth table

    A
    T
    F


which obviously contains at least one F; exactly one, in fact. Now we must show that (2) is satisfied. This is traditionally called the induction step. Suppose X is any sentence of L other than a sentence letter, i.e. other than A. X must therefore contain at least one occurrence of ∧, and hence be of the form Y∧Z, for some pair of sentences Y, Z of L. X’s immediate predecessors are therefore Y and Z, and we have to show that if both these have P, then so does X. Well, suppose that Y and Z have P, i.e. suppose that neither Y nor Z is a tautology. In that case, both Y and Z will have at least one F in their truth table, from which it follows that Y∧Z must have at least one F too, since we know that a conjunction is false if either conjunct is. Hence Y∧Z, i.e. X, is not a tautology. So we have shown that if X’s immediate predecessors have P, so too does X, and we can invoke the Induction Principle to infer that all sentences in L have P, i.e. are not tautologies. Q.E.D.

(Q.E.D. stands for ‘quod erat demonstrandum’, i.e. ‘which was to be demonstrated’. This is the phrase, translated into Latin from the original Greek, with which Euclid signalled the end of a demonstration in his celebrated Elements. The three letters, boldface, will be used to perform the same function throughout this book.)

To sum up: a proof by induction on immediate predecessors proceeds in the following three stages: (1) establish that all the sentence letters in L have P, the property in question; and (2) show that from the assumption that the immediate predecessors of some arbitrary sentence Z have P (this provisional assumption is called the inductive hypothesis), it follows that Z too must have P. If stages (1) and (2) are both successfully accomplished, then we invoke the Principle of Induction to conclude that (3) all the sentences of L have P.

We shall now look at another proof by induction, in which the induction step is a so-called proof by cases. This time L is the language L[A; →], i.e.
the set of all sentences which can be constructed from A and the single connective →. Let P now be the property of either being a tautology or being truth-functionally equivalent to A. (1) We first have to show that all sentence letters of L have P. Since there is only one, A, this amounts to showing that A has P. But obviously A has P, since A ⇔ A. (2) We now have to show for every non-atomic X in L, that if X’s immediate predecessors have P, then so does X itself. If X is non-atomic, this means that X must be of the form Y→Z for some sentences Y and Z of L. Y and Z are X’s immediate predecessors, so we now have to assume that both Y and Z have P, and see whether from that assumption we can show that X must have P. So let us assume that Y is equivalent to A or Y is a tautology, and the same for Z. This is our inductive


hypothesis, and it implies that there are four exclusive and exhaustive possibilities to consider: (i) Y is a tautology and Z is a tautology; or (ii) Y is a tautology and Z ⇔ A; or (iii) Y ⇔ A and Z is a tautology; or finally (iv) Y ⇔ A and Z ⇔ A. We shall now consider these in turn.

(i) We here have the truth table

    A    Y    Z    Y→Z
    T    T    T     T
    F    T    T     T

Clearly, X=Y→Z is a tautology, and therefore X has P.

(ii) Now we have the truth table

    A    Y    Z    Y→Z
    T    T    T     T
    F    T    F     F

Here X=Y→Z is obviously equivalent to A, and so in this case too X has P.

(iii) This gives the truth table

    A    Y    Z    Y→Z
    T    T    T     T
    F    F    T     T

in which case X=Y→Z is a tautology. Hence in this case X has P.

(iv) Here the truth table is

    A    Y    Z    Y→Z
    T    T    T     T
    F    F    F     T

and so in this case X is a tautology, and so has P.

We have just proved that in each of the four different possible cases, if Y and Z have P, so does X. The Principle of Induction on Immediate Predecessors now permits us to conclude that all sentences in L are either equivalent to A or are tautologies, and step (3) is accomplished. Q.E.D.
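The two results just proved by induction can also be spot-checked by brute force on all sentences whose ancestral trees have bounded height. This is a sanity check on finitely many sentences, not a substitute for the inductive proofs. The tuple representation of sentences and the names `sentences`, `value` and `table` are assumptions of this sketch.

```python
from itertools import product

def sentences(connective, depth):
    """Sentences built from the letter 'A' and one binary connective whose
    ancestral tree has at most `depth` levels below the top node."""
    level = ['A']
    for _ in range(depth):
        level = ['A'] + [(connective, y, z) for y, z in product(level, repeat=2)]
    return level

def value(x, a):
    """Truth-value of sentence x when A takes the value a."""
    if x == 'A':
        return a
    op, y, z = x
    vy, vz = value(y, a), value(z, a)
    return (vy and vz) if op == '∧' else ((not vy) or vz)   # '∧' or '→'

def table(x):
    """Two-row truth table of x: its value when A is T, then when A is F."""
    return (value(x, True), value(x, False))

TABLE_OF_A = (True, False)
TAUTOLOGY = (True, True)

# No sentence of L[A; ∧] (up to the chosen depth) is a tautology.
no_tautology = all(table(x) != TAUTOLOGY for x in sentences('∧', 3))

# Every sentence of L[A; →] (up to the chosen depth) is a tautology or equivalent to A.
taut_or_A = all(table(x) in (TAUTOLOGY, TABLE_OF_A) for x in sentences('→', 3))

print(no_tautology, taut_or_A)
```

The loop in `sentences` mirrors the inductive definition: each pass adds every compound whose immediate predecessors were already available on the previous pass.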


Exercises 1 What are the immediate predecessors of the following sentences? (a) (A∧B)∨(B∧A) (b) (A∧B)∨¬B (c) A (d) A→(B→C) (e) ¬(A→(B→C)) 2 Show by induction on immediate predecessors that every sentence in L[A; ∧] and every sentence in L[A; ∨] is truth-functionally equivalent to A. 3 Show by induction on immediate predecessors that if X is any sentence in L[A; ↔] then X is either a tautology or is truth-functionally equivalent to A. 4* (Replacement Principle) Suppose X, U, V are sentences in L[A, B, C,…; ∧, ∨, ¬, →]. Define X(V/U) as follows. If U is a sub-sentence of X (i.e. U is a node in X’s ancestral tree), X(V/U) is the result of substituting V for every occurrence of U in X; if U is not a subsentence of X, X(V/U)=X. Show by induction on immediate predecessors that if U⇔V then X⇔X(V/U).

A DIGRESSION ON MATHEMATICAL INDUCTION

(This can be skipped by those with an aversion to numbers.) Readers acquainted with the Principle of Mathematical Induction will find something very familiar about that of induction on immediate predecessors. This is as it should be, because at bottom both enunciate one and the same principle. The Principle of Mathematical Induction on the set N = {0, 1, 2, 3,…} of natural numbers says that if P is a property and (1) 0 has P and (2) whenever m in N has P so does m+1, then all natural numbers have P (there is an analogous principle for the positive integers Z+ = {1, 2, 3,…}, only (1) becomes the condition that 1 has P). Now m is the (sole) immediate predecessor of m+1, and so we can restate clause (2) as (2'): whenever the immediate predecessor of any non-zero number n has P so does n. We then have something which differs from the Principle of Induction on Immediate Predecessors only in the fact that truth-functional sentences can have multiple immediate predecessors, whereas each positive integer has only one. To put it another way, the ancestral tree of a sentence may and usually will branch, whereas the ancestral tree of a positive integer is a line. However, there are propositional languages whose structure is formally identical (the technical term is isomorphic) to that of the natural numbers with their successor/predecessor structure. For example, L=L[A; ¬], where A corresponds to 0 and passing from X to ¬X in L corresponds to passing from n to n+1 in N.

For those who have not seen a proof by mathematical induction before, here is a well-known and simple one. We want to show that 1+2+3+…+n = n(n+1)/2, for all n in Z+ (the set of positive integers). Let Sₙ stand for the sum of the first n positive integers, i.e. 1+2+…+n, and f(n) for the function n(n+1)/2. Let n have the property P just in case Sₙ = f(n). It is easy to show by mathematical induction that every n in Z+ has P. First we need to check that 1 has P. This is immediate, since clearly S₁ = 1 = f(1). Now for the induction step: we shall assume that m has P for some arbitrary m, and then show that m+1 has P. So we suppose that Sₘ = f(m). Adding m+1 to both sides, we infer that

    1+2+…+m+(m+1) = Sₘ₊₁ = f(m)+(m+1) = m(m+1)/2+(m+1) = (m(m+1)+2(m+1))/2 = (m+1)(m+2)/2 = f(m+1).

Thus from m’s having P we infer that m+1 has P, and the induction step is proved. Hence we infer that for all n in Z+, n has P. Q.E.D.

5 MULTIPLE CONJUNCTIONS AND DISJUNCTIONS

In this section we shall use the Principle of Induction on Immediate Predecessors to prove a very useful metaresult about truth-functional compounds which involve only the connective ∧, and compounds which only involve ∨. As a preamble we can verify by truth tables that the following two equivalences hold for any sentences X, Y and Z in any propositional language containing ∧ and ∨ (cf. exercise 1, Chapter 1, section 4):

    X∧(Y∧Z) ⇔ (X∧Y)∧Z
    X∨(Y∨Z) ⇔ (X∨Y)∨Z

These two conditions amount to saying that ∨ and ∧ are associative: it does not matter where you put the brackets in a conjunction or disjunction of X, Y and Z: X∧(Y∧Z) is true just when all of X, Y and Z are true, and X∨(Y∨Z) is false just when all of X, Y and Z are false. For this reason the sentences above are usually written simply as X∧Y∧Z and X∨Y∨Z respectively.

We shall now generalise this result. Let the set S in L[S; ∧] contain n sentence letters, which we shall write A1,…, An (these are metalinguistic symbols used simply to signify that S contains n sentence letters; no ordering of those letters is implied), and let X be a sentence in L[S; ∧], i.e. in L[A1,…, An; ∧]. Thus X is obtained by conjoining some or all of the sentence letters in S in any order, possibly with repetitions. Using the Principle of Induction on Immediate Predecessors we shall now show that for all X in L, X takes the value T in its truth table just when all the Ai in X take the value T.

Let P be the property a sentence X in L has when the following condition is satisfied: X takes the value T when and only when all the Ai in X take the value T. We shall show (1) that all the Ai have P, and (2) that for non-atomic X, if X’s immediate predecessors have P then so does X itself. (1) is immediate. For to say that Ai has P is to say that Ai is T when and only when Ai is T, which is itself (trivially) true. Now for the induction step (2). Suppose that X is not an Ai. Then X must be of the form Y∧Z, with immediate predecessors Y and Z for some sentences Y and Z in L. Let us suppose (the inductive hypothesis) that Y and Z each have P, i.e. they are T just when their component Ai are all T. We have to establish that the inductive hypothesis implies that X is T if and only if all its constituent sentence letters are T. (i) Suppose X is T. Then Y and Z must both be T. Hence, by the inductive hypothesis, the sentence letters in Y and Z are all T.
But these are just the sentence letters in X itself. (ii) Suppose all the sentence letters in X are T. Hence all the sentence


letters in Y and in Z are T. Hence, by the inductive hypothesis, Y and Z are both T. Hence X is T. So the induction step (2) is established and the result follows. Q.E.D.

This result tells us that any two sentences in L containing the same sentence letters are truth-functionally equivalent. A corollary is that we do not need to introduce a new n-place connective to represent the truth function of A1,…, An which is true just when all the Ai are true; any compound of the Ai built up using just the binary connective ∧ has the same truth table, namely T when all the Ai are T, and F in every other case. A further corollary is that if X contains A1,…, An then X can be written without ambiguity as A1∧…∧An: the bracketing inside X makes no difference to its truth-value. This is unlike the situation with →, for example, where A→(B→C) is not equivalent to (A→B)→C. By a similar use of the Induction Principle, we can show that a truth-functional compound X obtained by disjoining the sentences A1,…, An in any order is false just when all of A1,…, An are false, and we can therefore write X without ambiguity as A1∨A2∨…∨An.

Exercises
1 Write out in full the inductive proof that if X is in L[A1,…, An; ∨] then X is false just when all the Ai in X are false.
2 Show that if X1,…, Xn and Y are truth-functional compounds, then Y is a truth-functional consequence of the set {X1,…, Xn} if and only if Y is a truth-functional consequence of the sentence X1∧…∧Xn.

6 THE DISJUNCTIVE NORMAL FORM THEOREM

Consider the following list of truth-functional equivalences (if they are not already familiar, commit them to memory now):

Historical note The equivalences X∧Y ⇔ ¬(¬X∨¬Y) and X∨Y ⇔ ¬(¬X∧¬Y) are traditionally called de Morgan’s Laws, after the nineteenth-century mathematician Augustus de Morgan. This list of equivalences shows that in principle we could make do with only negation and one other of the connectives in the set {∧, ∨, ¬, →, ↔}, for each of the remainder is definable in terms of those two (a connective C is definable in terms of others in a set Π just in case for every sentence in L[A, B, C…; C] there is an equivalent one in L[A, B,


C…; Π]). In practice, it is convenient to have more than the bare minimum: for example, it is not immediately obvious that ¬(X→¬Y) is equivalent to X∧Y, and so while we could make do with ¬ and → to formulate conjunctions, it is simpler and clearer to add ∧ to the stock of basic vocabulary. Having too much basic vocabulary can also be inefficient. One would not in practice want to introduce a separate four-place truth-functional operator ç, where ç(A, B, C, D) is defined to be true when A and C are both true, true when A and B are false and D true, and false in all other cases: we simply do not have enough occasion to make this particular type of assertion to warrant giving it a special name. It is nevertheless nice to know that should such occasion actually arise, we should not be lost for words as long as we have ∧, ∨ and ¬. For ç(A, B, C, D) can easily be shown to be truth-functionally equivalent to a sentence in L[A, B, C, D; ∧, ∨, ¬]. Indeed, by a simple procedure we shall describe shortly, any n-place truth-functional operator ∂(A1,…, An) can be shown to be truth-functionally equivalent to some sentence in L[A1,…, An; ∧, ∨, ¬].

We shall first illustrate how the procedure works with ç(A, B, C, D). ç(A, B, C, D) was defined to be true when A and C are true, true when A and B are false and D true, and false in all other cases. So ç(A, B, C, D) is true for the following six rows, and only those rows, of its truth table:

The assertion that ç(A, B, C, D) is true is therefore equivalent to the sixfold disjunction: (A is T and B is T and C is T and D is T) or (A is T and B is F and C is T and D is T) or (A is T and B is T and C is T and D is F) or (A is T and B is F and C is T and D is F) or (A is F and B is F and C is T and D is T) or (A is F and B is F and C is F and D is T). We know that for any sentence X, X is false if and only if ¬X is true, i.e. X is F if and only if ¬X is T. So, for example, we can rewrite the second disjunct as ‘A is T and ¬B is T and C is T and D is T’, which is itself equivalent to ‘(A∧¬B∧C∧D) is T.’ We also know that for any sentences X and Y, X is T or Y is T if and only if X∨Y is T. From these facts we infer that ç(A, B, C, D) has the same truth table as the disjunction (A∧B∧C∧D)∨(A∧¬B∧C∧D)∨(A∧B∧C∧¬D)∨(A∧¬B∧C∧¬D)∨ (¬A∧¬B∧C∧D)∨ (¬A∧¬B∧¬C∧D)


and hence is truth-functionally equivalent to it.

We can easily generalise this procedure. Suppose X is a sentence in a propositional language whose sentence letters are A, B, C,… For each row of X’s truth table, write out a corresponding conjunction ±A∧±B∧±C∧…, where ±A is defined to be A if A takes the value T at that row, and is ¬A if A takes the value F at that row; similarly for ±B, ±C, etc. (the alphabetical ordering of A, B, C, etc. in the conjunctions is quite arbitrary; any other could be chosen instead). Now form the disjunction of all these conjunctions which correspond to T rows of X’s truth table. This disjunction is a sentence in L[A, B, C,…; ∧, ∨, ¬], which by the reasoning above is truth-functionally equivalent to X. This construction obviously presupposes that X takes the value T on at least one row of its truth table; if X doesn’t, i.e. if X is a contradiction, then X is equivalent to A∧¬A, which is, of course, also a sentence in L[A, B, C,…; ∧, ∨, ¬]. We have in effect proved the following theorem:

Theorem 1 (Disjunctive Normal Form Theorem) Suppose X is a sentence in a propositional language L with n sentence letters, which we shall denote by A1,…, An. If X is not a contradiction, then it is truth-functionally equivalent to a disjunction of conjunctions of the form ±A1∧…∧±An, where +Ai = Ai and −Ai = ¬Ai.

Corollary Any truth-functional assertion is equivalent to one which uses just the connectives ∧, ∨ and ¬.

Another way of stating the corollary is that every sentence in the language L in the theorem is truth-functionally equivalent to a sentence in L[A1,…, An; ∧, ∨, ¬]. This is one of the fundamental results of (meta)logic. It tells us that ∧, ∨, ¬ are the most one needs in terms of truth-functional operations to formulate any truth-functional assertion, where by a truth-functional assertion is meant one which can be evaluated as true or false by means of a truth table.
Nor is this set of connectives the smallest with this ‘universal’ property: we know that either disjunction or conjunction can be defined in terms of each other together with ¬; indeed, as we shall see in the next section, we can actually find a single binary connective with the same ‘universal’ property. A non-contradictory sentence X which is already a disjunction of conjunctions ±A1∧… ∧±An, all with the same number n of conjuncts, is said to be in Disjunctive Normal Form (DNF). When X is a sentence not in DNF, then by the theorem above it is equivalent to one that is. There will be as many of these DNF equivalents of X as there are sentences representable in this form in L. Recall that there will be infinitely many of these in general, allowing for differences in bracketing, the order in which the sentence letters occur, and possible repetitions of sentence letters (see pp. 39–40). We could nevertheless select one of these to be the canonical representative, call it DNF(X), as the disjunctive normal form of any sentence X in L, if we wished. One way might be as follows. Order the sentence letters in X, say alphabetically, and take as DNF(X) that disjunction in L whose disjuncts D1, D2, D3…are of the form ((±A∧±B)∧±C)∧…and then write the


disjuncts similarly as ((D1∨D2)∨D3)∨…. Any other choice of canonical representative would in principle be just as good, however, and we shall simply write DNF(X) as a multiple disjunction D1∨D2∨D3∨…, without internal pairwise bracketing, of multiple conjunctions ±A∧±B∧±C∧…, also without internal pairwise bracketing.

Example Let X be the sentence C→¬(B∨A). To find DNF(X), first construct the truth table for X:

X is true at rows 2, 5, 6, 7 and 8. These rows will determine DNF(X). Since the letters in the conjuncts of DNF(X) will appear in the order A, B, C, we have DNF(X)=(A∧B∧¬C) ∨ (¬A∧¬B∧C) ∨ (¬A∧B∧¬C) ∨ (A∧¬B∧¬C) ∨ (¬A∧¬B∧¬C). Exercises 1 If X contains k sentence letters, what is the maximum number of disjuncts DNF(X) can possess? 2 Find the Disjunctive Normal form of each sentence below. (i) A→B (ii) A∨B (iii) A∧B (iv) A↔B (v) A¬A∨B (vi) A∧¬(B∨¬(C→¬A)) (vii) ¬(B∨A)
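The row-by-row construction behind the Disjunctive Normal Form Theorem can be sketched in a few lines. The code below is an illustration under stated assumptions: a truth function is passed in as an ordinary Python function of booleans, the name `dnf` is hypothetical, and rows are enumerated in the order produced by `itertools.product`, so the disjuncts may appear in a different order from the one in the text (the order does not affect the truth table).

```python
from itertools import product

def dnf(truth_function, letters):
    """Disjunction of the conjunctions ±A∧±B∧… corresponding to the T rows."""
    disjuncts = []
    for row in product([True, False], repeat=len(letters)):
        if truth_function(*row):
            # ±A is A if A is T at this row, and ¬A if A is F at this row.
            literals = [p if v else '¬' + p for p, v in zip(letters, row)]
            disjuncts.append('(' + '∧'.join(literals) + ')')
    if not disjuncts:
        # Convention from the text: a contradiction is represented by A∧¬A.
        return letters[0] + '∧¬' + letters[0]
    return ' ∨ '.join(disjuncts)

# X = C→¬(B∨A), the worked example.
def x(a, b, c):
    return (not c) or not (b or a)

# ç(A, B, C, D): true when A and C are true, or when A and B are false and D true.
def c4(a, b, c, d):
    return (a and c) or ((not a) and (not b) and d)

print(dnf(x, 'ABC'))
print(dnf(c4, 'ABCD'))
```

Each T row contributes exactly one disjunct, so the number of disjuncts equals the number of Ts in the truth table of the sentence.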

7 ADEQUATE SETS OF CONNECTIVES A set of connectives in terms of which all truth-functions can be defined is said to be adequate for truth-functional logic, or just adequate, for short. The Disjunctive Normal Form Theorem tells us that the set {∧, ∨, ¬} is adequate. So, of course, is any set which includes these connectives. Also, in view of the fact that ∨ (respectively ∧) is definable, by de Morgan’s Laws, in terms of ¬ and ∧ (respectively ∨), we infer that both sets {¬, ∧}


and {¬, ∨} are adequate. Because X∧Y ⇔ ¬(X→¬Y), and X∨Y ⇔ ¬X→Y, we know that {→, ¬} is also adequate. Surprisingly, we can find a binary connective which by itself turns out to be adequate. In fact, we can find two, symbolised | and ↓, whose truth tables are these:

    A    B    A|B    A↓B
    T    T     F      F
    T    F     T      F
    F    T     T      F
    F    F     T      T

| is sometimes called alternative denial, because A|B is true just when one at least of A and B is false, and ↓ is sometimes called joint denial, because A↓B is true just when both A and B are false. Other names for | and ↓ are ‘nand’ and ‘nor’ respectively (we shall see why shortly). It might not seem obvious, looking at these truth tables, that we can define negation in terms both of | and ↓. But, as can easily be checked with a truth table, if X is any sentence

    (5) ¬X ⇔ X|X and ¬X ⇔ X↓X

We can also see from the truth table for | that

    A|B ⇔ ¬(A∧B)

which is why | is called ‘nand’, nand abbreviating ‘not-and’. Hence

    A∧B ⇔ ¬(A|B)

and substituting A|B for X in (5) we get

    A∧B ⇔ (A|B)|(A|B)

In other words, we have shown that both ¬ and ∧ are definable in terms of |. Hence, since {¬, ∧} is an adequate set, | must also be adequate. Now for ↓. From its truth table we can see that

    A↓B ⇔ ¬(A∨B)

which is why ↓ is often called ‘nor’, i.e. ‘not-or’, or for that matter the ordinary English ‘nor’. Hence

    A∨B ⇔ ¬(A↓B)

and so, by (5) again,

    A∨B ⇔ (A↓B)↓(A↓B)

So both ¬ and ∨ are definable in terms of ↓; hence, since {¬, ∨} is an adequate set, ↓ is adequate. Exercises 1 Write down the (unsigned) conjugate tree diagrams for | and ↓. 2 Find a sentence X in L[A, B; |] such that A∨B ⇔ X, and a sentence Y in L[A, B; ↓] such that A∧B ⇔ Y. 3 Find sentences X in L[A, B; |] and Y in L[A, B; ↓] such that A→B ⇔ X and A→B ⇔ Y. 4 Which of the following are tautologies, contradictions and neither? (i) (A|B)|(B|A) (ii) (A↓(A↓A))↓(A↓(A↓A)) (iii) (A|(A|A))|(A|(A|A)) (iv) (A|B)↓(A|B) 5 Find the DNFs of (i) (A|B)|(B|A) (ii) A↓(B↓C) 6 Explain why it makes no sense to write A|B|C or A↓B↓C. 7* Show by induction on immediate predecessors that if X is any sentence in L[A, B; ↔, ¬], then X has 0 or 2 or 4 Ts in its truth table (i.e. in the truth table with four rows determined by the four truth-value distributions over A and B). Explain why this shows that {↔, ¬} is not adequate. Hint: think of conjunction and disjunction.
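The claims of this section about | and ↓ can be checked by running through every row of the relevant truth tables. The sketch below assumes Python booleans for T and F; the names `nand`, `nor` and `equivalent` are illustrative choices, and the particular defining sentences checked are the ones obtained by writing ¬X as X|X (respectively X↓X).

```python
from itertools import product

def nand(x, y):
    """Alternative denial x|y: true just when at least one of x, y is false."""
    return not (x and y)

def nor(x, y):
    """Joint denial x↓y: true just when both x and y are false."""
    return not (x or y)

def equivalent(f, g, n):
    """True if the n-place truth functions f and g agree on every row."""
    return all(f(*row) == g(*row) for row in product([True, False], repeat=n))

checks = {
    '¬X ⇔ X|X': equivalent(lambda x: not x, lambda x: nand(x, x), 1),
    '¬X ⇔ X↓X': equivalent(lambda x: not x, lambda x: nor(x, x), 1),
    'A∧B ⇔ (A|B)|(A|B)': equivalent(lambda a, b: a and b,
                                    lambda a, b: nand(nand(a, b), nand(a, b)), 2),
    'A∨B ⇔ (A↓B)↓(A↓B)': equivalent(lambda a, b: a or b,
                                    lambda a, b: nor(nor(a, b), nor(a, b)), 2),
}

for claim, holds in checks.items():
    print(claim, holds)
```

Since {¬, ∧} and {¬, ∨} are adequate, these four equivalences are all that is needed to conclude that | alone and ↓ alone are adequate.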

8* THE DUALITY PRINCIPLE

An interesting and elegant result that can be easily proved using the Principle of Induction on Immediate Predecessors is the Duality Principle:

Theorem 2 (Duality Principle) Let X be any sentence in L[A1,…, An; ∧, ∨, ¬]. Let X* be obtained from X by replacing every occurrence of ∧ in X by ∨, every occurrence of ∨ by ∧, and every occurrence of each Ai by ¬Ai. Then X⇔¬X*. (X* is called the dual of X.)

Proof A sentence X of L, where L is as in the theorem, will be said to have the property P if


X⇔¬X*. We shall prove by induction on immediate predecessors that all sentences of L have P. So we have to establish that the following two conditions are satisfied: (1) each Ai has P; and (2) for any non-atomic X, from the inductive hypothesis that the immediate predecessors of X have P, it follows that X does also.

(1) Each Ai clearly has no occurrence of ∨ or ∧, and so Ai* is just ¬Ai. So showing that Ai has P merely requires showing that Ai⇔¬¬Ai, which we know to be the case.

(2) The induction step is an argument by cases. If X is not an Ai then X must have one of the following three forms: (i) X=Y∨Z, (ii) X=Y∧Z, or (iii) X=¬Y where Y and Z are sentences of L. If X is of the form (i) or (ii) it has as immediate predecessors Y and Z, while if it is of the form (iii) it has the one immediate predecessor Y. We shall check that the induction step holds in each of the cases.

(i) Suppose that Y and Z each have P, i.e. that Y⇔¬Y* and Z⇔¬Z*. This supposition, recall, is the inductive hypothesis. From this we infer that Y∨Z ⇔ ¬Y*∨¬Z* (exercise 1 below). By de Morgan’s Laws ¬Y*∨¬Z* ⇔ ¬(Y*∧Z*). But Y*∧Z*=(Y∨Z)* (exercise 2 below), and Y∨Z=X. So we have shown that the inductive hypothesis implies that X⇔¬X*, i.e. X has P as required.

(ii) We have the same inductive hypothesis as in (i). So again Y⇔¬Y* and Z⇔¬Z*. Hence Y∧Z ⇔ ¬Y*∧¬Z*. By de Morgan again, ¬Y*∧¬Z* ⇔ ¬(Y*∨Z*). But Y*∨Z*=(Y∧Z)*=X*. So X⇔¬X* in this case too.

(iii) Here the inductive hypothesis is simply that Y⇔¬Y*. Hence ¬Y⇔¬¬Y*. But ¬Y*=(¬Y)*=X*. Hence X⇔¬X*. Q.E.D.

Exercises
1 Show that if X⇔Y and V⇔W then (a) X∧V⇔Y∧W (b) X∨V⇔Y∨W
2 For any sentences Y, Z in the language L[A1,…, An; ∧, ∨, ¬], explain why (a) ¬Y*=(¬Y)* (b) Y*∧Z*=(Y∨Z)* (c) Y*∨Z*=(Y∧Z)* (Imagine you are a computer programmed with the instructions for converting a sentence X in L into its dual. The program performs its function by working from left to right through X, examining every symbol, changing it appropriately or leaving it unaltered.)
3 Show that if X⇔Y then X*⇔Y*
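The dualising procedure hinted at in exercise 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding of sentences and the helper names `dual`, `value`, `letters` and `equivalent` are invented here, and the dual is taken to be what the proof above uses, namely Ai* = ¬Ai, (¬Y)* = ¬Y*, (Y∨Z)* = Y*∧Z* and (Y∧Z)* = Y*∨Z*. The check of X⇔¬X* is done by brute force over truth-value distributions, not by the inductive argument.

```python
from itertools import product

# Assumed encoding: a sentence letter is a string; compound sentences are
# ("not", X), ("and", X, Y) or ("or", X, Y).

def dual(x):
    """Return X*: swap ∧ and ∨ and replace each sentence letter A by ¬A."""
    if isinstance(x, str):
        return ("not", x)
    if x[0] == "not":
        return ("not", dual(x[1]))
    swapped = "or" if x[0] == "and" else "and"
    return (swapped, dual(x[1]), dual(x[2]))

def value(x, tau):
    """Truth-value of x under the distribution tau (a dict letter -> bool)."""
    if isinstance(x, str):
        return tau[x]
    if x[0] == "not":
        return not value(x[1], tau)
    if x[0] == "and":
        return value(x[1], tau) and value(x[2], tau)
    return value(x[1], tau) or value(x[2], tau)

def letters(x):
    """Set of sentence letters occurring in x."""
    if isinstance(x, str):
        return {x}
    return set().union(*(letters(part) for part in x[1:]))

def equivalent(x, y):
    """True if x and y agree under every truth-value distribution."""
    names = sorted(letters(x) | letters(y))
    return all(
        value(x, dict(zip(names, row))) == value(y, dict(zip(names, row)))
        for row in product([True, False], repeat=len(names))
    )

# An arbitrary sample sentence: (A∧¬B)∨¬(B∨C)
X = ("or", ("and", "A", ("not", "B")), ("not", ("or", "B", "C")))
duality_holds = equivalent(X, ("not", dual(X)))
print(duality_holds)
```

A check of this kind covers one sentence at a time; the induction in the text is what establishes the result for every sentence of L.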

9* CONJUNCTIVE NORMAL FORMS An important application of the Duality Principle is in finding what are called the Conjunctive Normal Forms of truth-functional compounds. Define a literal to be a sentence letter or the negation of one. DNF(X), where X is any non-contradictory


sentence, is a disjunction of conjunctions of literals. Now it is easy to show from the Disjunctive Normal Form Theorem and the Duality Principle that X is also equivalent to a conjunction of disjunctions of literals. Since for any sentence X in any language L, X ⇔ DNF(X), we must have ¬X ⇔ DNF(¬X). DNF(¬X) is of course a sentence whose connectives are only ∧, ∨ and ¬, and so by Duality, DNF(¬X) ⇔ ¬(DNF(¬X))*, where as before (DNF(¬X))* is the dual of DNF(¬X). Hence we have ¬X ⇔ ¬(DNF(¬X))*, and so X ⇔ (DNF(¬X))*. If we now eliminate any double negation ¬¬ that might appear in (DNF(¬X))* we obtain a conjunction of disjunctions of sentence letters and their negations, and this conjunction is called the Conjunctive Normal Form CNF(X) of X.

For example, suppose X is the compound ¬(A∧(B→C)). As a first step, write out DNF(¬X). This is (A∧B∧C) ∨ (A∧¬B∧¬C) ∨ (A∧¬B∧C) (as in 6, we are assuming that the ordering of sentence letters relative to which the DNF’s are defined is alphabetical). Dualising, and eliminating double negations, we get CNF(X)=(¬A∨¬B∨¬C) ∧ (¬A∨B∨C) ∧ (¬A∨B∨¬C).

To sum up: to find the CNF of a compound X you (i) find DNF(¬X), (ii) obtain its dual (DNF(¬X))*, and (iii) eliminate all double negations from (DNF(¬X))*; the result is CNF(X). If X is a tautology then its negation is a contradiction, and the DNF of ¬X is, according to the convention agreed earlier, A∧¬A. By steps (ii) and (iii) above, CNF(X)=¬A∨A.

Another example Let X be (C→¬(B∨A))∧¬(C∧A). DNF(¬X)=(A∧B∧C) ∨ (¬A∧B∧C) ∨ (A∧¬B∧C). Hence CNF(X)=(¬A∨¬B∨¬C) ∧ (A∨¬B∨¬C) ∧ (¬A∨B∨¬C).

Conjunctive Normal Forms are very important in logic programming, where the inferential technique of resolution works on sentences cast into this form.

Exercises Write out CNF(X) for each sentence X below: 1 A→B 2 A∧B 3 A→(B∨C) 4 A→(B→A)
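The three-step recipe for CNF(X) can be sketched in Python as follows. The encoding is an assumption made for illustration: X is given as a Python function of booleans, clauses are lists of strings, and the names `cnf_clauses` and `show` are hypothetical. Each row of the truth table on which ¬X is true supplies one conjunction of DNF(¬X); dualising it and deleting double negations amounts to negating every literal in it.

```python
from itertools import product

def cnf_clauses(x, names):
    """CNF(X) as a list of clauses, obtained via the dual of DNF(¬X).

    x is a Python function taking one boolean per name in `names`.
    """
    clauses = []
    for row in product([True, False], repeat=len(names)):
        if not x(*row):  # ¬X is true on this row, so it contributes to DNF(¬X)
            # Dual plus double-negation elimination: a letter that is true
            # on the row appears negated, one that is false appears plain.
            clauses.append(["¬" + n if v else n for n, v in zip(names, row)])
    # Convention from the text: if X is a tautology, CNF(X) is ¬A∨A.
    return clauses or [["¬" + names[0], names[0]]]

def show(clauses):
    """Render a list of clauses as a conjunction of disjunctions."""
    return " ∧ ".join("(" + "∨".join(c) + ")" for c in clauses)

# First example from the text: X is ¬(A∧(B→C)).
X = lambda A, B, C: not (A and ((not B) or C))
result = cnf_clauses(X, ["A", "B", "C"])
print(show(result))
```

The conjuncts come out in truth-table order, which need not match the order in which the text lists them; as a conjunction the result is the same.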

Chapter 4 Soundness and completeness 1 THE STANDARD PROPOSITIONAL LANGUAGE We have already had some experience, in Chapter 2, of constructing truth trees. In this chapter we shall establish the procedure on a systematic basis and give rigorous proofs of their fundamental properties, namely that if all the members of a finite set of truth-functional sentences are satisfiable, i.e. true for some distribution of truth-values over their sentence letters, then they generate a finite open tree, while if they are not satisfiable they generate a closed one. These results establish a relationship between a syntactic property of a finite set Σ of sentences, its generating a closed or open tree (this is a syntactic property because, as we shall see, rules can be given for tree construction of a purely formal character), and a semantic one, satisfiability. To prove them we have to state more precisely what form the rules of tree construction take. To begin with, we need to confine the discussion to a specific propositional language. Because our aim is a model of deductive reasoning of as great a generality as can be achieved, this propositional language should be universal, in the sense that any truth-functional inference can be represented in it. We know from the Disjunctive Normal Form Theorem that the truth-functional structure of all the sentences appearing as premises and conclusion in any inference can be translated into sentences of a propositional language L[S; Π] in which Π is the set {∧, ∨, ¬}. So any propositional language with those connectives and in which there is an unlimited supply of sentence letters will be universal in the sense we require. To give ourselves something a bit more like the luxury of ordinary language (which is what, after all, we are modelling), we shall add → to the set {∧, ∨, ¬}.
What we shall now do is select an arbitrary one of these universal languages, which differ only in their sentence letters, and call it the standard propositional language; in the rest of this chapter it will be referred to as L[A, B, C,…; ∧, ∨, ¬, →] or just L. ↔ is not one of L’s connectives; for technical reasons which will become apparent it is convenient to do without it. However, it is of course there implicitly, and we shall continue to write X↔Y in appropriate circumstances, regarding this as merely shorthand for one of its truth-functional equivalents in L, like (X→Y)∧(Y→X), for example.

2 TRUTH TREES AGAIN We can ‘collapse’ all the conjugate diagrams for ∧, ∨, → into just two in the following way. First, we rearrange them into two groups as follows:


(i) The non-branching diagrams, whose upper sentences are X∧Y, ¬(X∨Y) and ¬(X→Y).

(ii) The branching diagrams, whose upper sentences are X∨Y, X→Y and ¬(X∧Y).

We shall classify the upper sentences in (i) and (ii) as α and β sentences as follows (with a minor variation this is Smullyan’s (1968) classification): An α sentence of L is of the form either X∧Y, ¬(X∨Y) or ¬(X→Y), where X and Y are sentences of L. For each of these three types of α sentence we define a corresponding pair of sentences α1, α2:

α           α1     α2
X∧Y         X      Y
¬(X∨Y)      ¬X     ¬Y
¬(X→Y)      X      ¬Y

A β sentence of L is of the form X∨Y, X→Y, or ¬(X∧Y), where X and Y are sentences of L. For each of these three types of β sentence we define corresponding pairs of sentences β1, β2:

β           β1     β2
X∨Y         X      Y
X→Y         ¬X     Y
¬(X∧Y)      ¬X     ¬Y

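The two groups of diagrams above can now be represented by just two diagrams: the first places α1 and α2, one above the other, directly below α on the same branch, and the second splits the branch below β into two, with β1 at the foot of one and β2 at the foot of the other.

These diagrams will henceforth be known as the tree rules (α) and (β) respectively; the pairs of sentences α1, α2 and β1, β2 will be called the descendants of α and β respectively under the rules. Together with the Rule of Double Negation, which places X directly below ¬¬X,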


these will be all the rules we shall employ in constructing truth-functional truth trees. A simple result which we shall find useful and which can be left to the reader to check is the following: Theorem 1 If α is any α sentence and β is any β sentence then α⇔α1∧α2 and β⇔β1∨β2. Theorem 1 explains why α sentences are sometimes called conjunctive sentences, and β sentences disjunctive sentences. Another useful result we shall need is this: every sentence in L is either a literal, or a sentence commencing with at least one pair of consecutive occurrences of a negation symbol, or an α sentence, or a β sentence. It’s hardly necessary to dignify this with the title ‘theorem’, and we shall leave its demonstration as an exercise (you just need to note that every sentence in L is either a sentence letter or else is a sentence of the form ¬X, X∧Y, X∨Y or X→Y, with outer brackets omitted as usual). Before proceeding to a formal statement of the rules of tree construction, we shall briefly review some general tree-concepts. The nodes (see Chapter 3, section 3) are the end-points and junction-points of line-segments. The topmost node is the root of the tree. The nodes on an ancestral tree are all single sentences; those on a truth tree, by contrast, may also be constituted by sets of sentences: the pair {α1, α2}, appearing on a tree without the set brackets {} and with α1 conventionally written above α2, is a single node, as is the set of initial sentences which will, unless otherwise indicated, form the root of the tree. A branch in a tree is the sequence of nodes in any continuous path along line-segments upwards from the lowest node to the root, including the end-points. We can regard a single node as a degenerate branch. Of course, we have not yet formally defined a tree. So far we have relied on the following informal characterisation.
A finished tree generated by a finite set Σ of initial sentences is the entity generated by applying a tree rule to an unused sentence in Σ (if there is one; if there isn’t, Σ itself is the finished tree generated by Σ), and continuing to apply the appropriate rule to every unused sentence on every branch generated, closing a branch as soon as a sentence and its negation appear on it. When either every branch has closed or there are no further sentences that can be used, the tree is finished. A defect of this definition, apart from its informality, is that it fails to highlight one of the most important features of truth trees, namely the fact that they can be constructed in a completely mechanical way. Another way of putting this is to say that their construction can be made entirely algorithmic, or programmable on a suitably idealised computer. Here is a ‘proto-program’ for constructing a tree from Σ: Start. Write the members of Σ in a column. Check that no sentence and its negation are in Σ. If there is a sentence and its negation in Σ, draw a line under Σ, and stop: the tree consisting of just Σ is said to be finished and closed. If there is no sentence and its negation in Σ, check that there is at least one non-literal in Σ (as in Chapter 3, section 9*, a literal is any sentence letter or negation of a sentence letter). If there is not, stop: the tree consisting just of Σ is finished and open. If neither of these eventualities is the case, select the topmost non-literal X in Σ. We know that X is either a sentence commencing with more than one negation symbol, or else is an α or β sentence. We deal with these


cases in turn.

(i) If X is a sentence commencing with more than one ¬, we can write it ¬¬Y; now apply the double negation rule to X, i.e. write Y directly below Σ. X has now been used.

(ii) If X is an α sentence, apply the (α) rule to X; i.e. write α1 and α2, one above the other, directly below Σ, where α1, α2 are the descendants of X under that rule. X has now been used.

(iii) If X is a β sentence, apply the (β) rule to X; i.e. write β1 and β2 at the feet of two separate branches drawn below Σ, where β1, β2 are the corresponding descendants of X under that rule. X has now been used.

Now repeat the following sequence of instructions, enclosed within < and > brackets, in the order they appear, until no further repetition of the sequence is possible. When that happens, stop. The result is a finished tree generated from Σ. We called this set of instructions a proto-program because while it is not a computer program, it can be turned into one. When started up it will always halt after finitely many repeats, whatever the set of initial sentences, so long as that set is finite. This is because each of the rules (α) and (β) applied to a sentence eliminates a binary connective, while the Rule of Double Negation ensures that at some stage in the evolution of the tree every sentence on it commencing with a consecutive sequence of ¬’s will be reduced either to a literal, or else to an α or β sentence. As there are only finitely many occurrences of a connective in any sentence, it follows that after finitely many stages of the program all the sentences in Σ will have been decomposed into literals, or else the tree will have closed at some earlier point. As we pointed out in Chapter 2, neither the numbering of lines on the tree nor the


ticking of sentences on it as they are used is integral to the tree: they merely help the tree-grower keep stock of what they are doing. Henceforward we shall number lines where it is obviously helpful, though instead of ticking sentences as they are used we shall state at each line, where it is not obvious, which line number was used to obtain that line.

Exercises
1 For each of the following sentences, say whether it is an α sentence, a β sentence, a literal, or none of these. (i) A→(B→C) (ii) ¬¬A (iii) (A∧B)∨¬(A∧¬C) (iv) ¬B (v) ¬(B→(¬C∨¬¬D)) (vi) ((C∨¬D)∨¬(D∧B))∧¬(A∨¬A)
2 Construct finished trees for the following sets of sentences, and say whether the trees are open or closed. (i) {A→B, B→C, ¬C, A} (ii) {A→B, B→C, ¬C, ¬A} (iii) {A∨B, ¬B∨C, ¬A, ¬C} (iv) {A∨B, ¬B∨C, ¬A, C}
3 Show that every sentence in L is either a literal, or a sentence whose initial symbols are a block of more than one negation symbol, or else is an α or β sentence.

3 TRUTH-FUNCTIONAL CONSISTENCY, TRUTH-FUNCTIONALLY VALID INFERENCES, AND TREES Suppose we have generated a finished tree from some finite set Σ of initial sentences. There are two possibilities: (i) the tree is closed, and (ii) it is open. We have already been told how to interpret open and closed trees in terms of the satisfiability or not respectively of their sets of initial sentences. This interpretation is justified by a pair of metatheorems which will be proved shortly, the Soundness and Completeness Theorems for truth-functional trees, whose names we are already familiar with but whose precise statement will now be useful: Soundness Theorem If τ is a truth-value distribution over sentence letters which makes all the members of Σ true, then every finished tree generated by Σ is open, and all the sentences on each open branch are made true by τ.


Completeness Theorem If Σ generates a finished open tree, then there is a distribution of truth-values to sentence letters which makes all the members of Σ true; indeed, any truth-value assignment which makes all the literals on each open branch true is one. Hence if (i) is the case, the Soundness Theorem tells us that there is no truth-value distribution over the sentence letters appearing in Σ which satisfies Σ, i.e. which makes all the sentences in Σ true (this is simply the contrapositive form of the Soundness Theorem). If (ii) is the case, the Completeness Theorem tells us that assigning the value T to each literal on any open branch, and any values whatever to the remaining sentence letters in Σ, gives a truth-value distribution over the sentence letters in Σ which satisfies Σ. We shall say that a set of sentences in any propositional language is truth-functionally consistent if there is a distribution of truth-values over its sentence letters which makes all its sentences true (this is the property referred to earlier as satisfiability). If there is not, then it is truth-functionally inconsistent. The Soundness Theorem and Completeness Theorem tell us that a finite set of sentences in the standard propositional language is truth-functionally consistent if and only if it generates a finished open tree. We now list some further consequences of these theorems.

(a) Whether a tree generated from a finite set Σ closes or not does not depend on the order in which you use the unused sentences which appear at any stage in the development of the tree; you will usually get different trees, but either all those trees close or all remain open, depending only on whether Σ is truth-functionally consistent or not. If Σ is truth-functionally consistent, then from Soundness it follows that any finished tree generated from Σ has an open branch.
The Completeness Theorem implies that if any tree generated from Σ has an open branch, then Σ is truth-functionally consistent; hence if Σ is truth-functionally inconsistent any tree generated from Σ closes.

(b) If an inference is truth-functionally valid then the set Σ consisting of its premises and the negation of its conclusion is truth-functionally inconsistent (to show this is left as exercise 1 below), and the Completeness Theorem implies that any tree generated by Σ will close. We shall say that there is a tree proof of the conclusion of an inference from its premises if a closed tree can be generated from its premises and the negation of its conclusion. Thus the Completeness Theorem implies that if an inference is truth-functionally valid then there is a tree proof of its conclusion from its premises (this is the sense of ‘completeness’ to which the name of the theorem refers).

(c) If an inference is truth-functionally invalid then the set Σ consisting of its premises and the negation of its conclusion is truth-functionally consistent. Hence by the Soundness Theorem any tree generated by Σ will remain open, and by the Completeness Theorem every truth-value distribution over sentence letters obtained by assigning all the literals the value T on each open branch is a counterexample to the inference.

(d) If the set Σ consisting of the premises and the negation of the conclusion of an inference generates a finished open tree, then every counterexample to the inference can be obtained from some open branch. For suppose τ is a distribution of truth-values to the sentence letters of the inference which makes all the sentences in Σ true. Then there is an open branch in any finished tree generated by Σ on which all the sentences, and hence all the literals, are true under τ.
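The link in (b) and (c) between validity and the inconsistency of the premises together with the negated conclusion can be illustrated by a brute-force search, sketched here in Python. This is deliberately not the tree method: it simply runs through every truth-value distribution, which is what a finished tree lets us avoid. The function names `counterexamples` and `imp`, and the encoding of sentences as Python functions of a distribution, are assumptions made for the example, as is the choice of the two sample inferences.

```python
from itertools import product

def counterexamples(premises, conclusion, names):
    """Distributions making every premise true and the conclusion false.

    An empty result means the premises together with the negation of the
    conclusion are truth-functionally inconsistent, i.e. the inference is valid.
    """
    found = []
    for row in product([True, False], repeat=len(names)):
        tau = dict(zip(names, row))
        if all(p(tau) for p in premises) and not conclusion(tau):
            found.append(tau)
    return found

def imp(x, y):
    """Truth-function of →."""
    return (not x) or y

# A→B, B→C ∴ A→C
hyp_syll = counterexamples(
    [lambda t: imp(t["A"], t["B"]), lambda t: imp(t["B"], t["C"])],
    lambda t: imp(t["A"], t["C"]),
    ["A", "B", "C"],
)

# A→B, B ∴ A (an invalid inference, chosen for contrast)
affirm_consequent = counterexamples(
    [lambda t: imp(t["A"], t["B"]), lambda t: t["B"]],
    lambda t: t["A"],
    ["A", "B"],
)

print(hyp_syll)
print(affirm_consequent)
```

The first search finds nothing, so the inference is valid; the second finds the single distribution on which A is false and B is true.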


Now for some examples. We shall start by showing that the inference in exercise 3(ii), section 8, Chapter 1, is truth-functionally valid. The inference, formalised

has a tree proof as follows (opposite):

Now look at the inference of exercise 3(i), section 8, Chapter 1. Formalised in the standard propositional language, it becomes

The finished open tree below shows that it is truth-functionally invalid. The justification for each line has been omitted; as an exercise the reader should supply it.


The tree is finished, with one open branch. The Completeness Theorem tells us that if the value T is assigned to each of the literals on this branch, the initial sentences are themselves assigned the value T. In this way, the branch determines a counterexample to the inference. In fact, it determines two, since it specifies the truth-values only on A, B, C and E, namely A–F, B–F, C–T, E–F, but it does not determine that of D. Hence the open branch determines the two distributions A–F, B–F, C–T, D–T, E–F and A–F, B–F, C–T, D–F, E–F, both of which are counterexamples to the inference. Below are some simple truth-functionally valid inferences, known for so long that they have acquired classic status (and names, in the case of the first two). We have already made the acquaintance of some of them. Though not essential, it is useful to remember them.

Modus ponens: A→B, A ∴ B
Modus tollens: A→B, ¬B ∴ ¬A
Hypothetical syllogism: A→B, B→C ∴ A→C
Disjunctive syllogism: A∨B, ¬A ∴ B
Importation: A→(B→C) ∴ (A∧B)→C
Contraposition: A→B ∴ ¬B→¬A

Exercises 1 Construct tree proofs for hypothetical syllogism and importation. 2 Give tree proofs of the truth-functional validity of the following (i) A, ¬A ∴B (ii) B ∴A∨¬A 3 State which of the following inferences are truth-functionally valid, and justify your statements by constructing appropriate trees, using the sentence letters indicated. For


any which is not valid, list the counterexamples to it. (a) Tom will go to the show (A) only if Amanda will go (B), but Amanda won’t go unless Henry will go (C). If Henry won’t go, therefore, neither will Tom. (b) A person is entitled to benefit (A) only if either they are unemployed (B), or they are over 60 (C) and they have a disposable income of less than £10,000 per year (D). Therefore, if they have an income of less than £10,000 per year and are over 60 they are entitled to benefit. (c) If the Ukraine secedes from the treaty (A), and allies itself with Poland (B), then Georgia will ally itself with Russia (C). Georgia won’t ally itself with the Baltic republics (¬D) if the latter support economic decentralisation (E), and if Georgia allies itself with Russia, then the Baltic republics will support economic decentralisation or ask for help elsewhere (F). Therefore if the Ukraine secedes from the treaty and Georgia allies itself with the Baltic republics, then either the Baltic republics will ask for help elsewhere or the Ukraine won’t ally itself with Poland. 4 From the open trees generated by each of the following sentences, identify the distributions of truth-values over the sentence letters which make the sentences true. (a) B→¬(A∧¬C) (b) ¬(D∧(C→(D∨C)))

4* SOUNDNESS AND COMPLETENESS THEOREMS We shall now prove the Soundness and Completeness Theorems for truth-functional trees. The Soundness Theorem requires the following lemma (a lemma is merely a preliminary result): Lemma 1 Suppose that B is a branch on a tree. If all the sentences on B are true under some distribution of truth-values to their sentence letters, then B is open. Proof If B were closed it would contain a sentence and its negation; and both cannot be true together. Theorem 2 (Soundness Theorem for Propositional Truth Trees) Let Σ be a finite set of sentences in L. If Σ is satisfiable by a truth-value distribution τ over its sentence letters, then any finished tree generated by the rules of tree construction has an open branch, all the sentences on which take the value T under τ.


Proof Suppose that a finished tree has been constructed from Σ and that there is some distribution τ of truth-values to the sentence letters in Σ which makes all the sentences in Σ true. We shall find an open branch in the tree. Let B be the branch-segment consisting just of Σ (B is therefore just a single node). By Lemma 1 B is open. There are two possibilities: Σ contains at least one non-literal, or it does not. If it does not, B is a finished open branch all of whose sentences take the value T. If B does contain a non-literal, then B has an extension B' in the tree generated by the application of one of the rules (α) or (β) or Double Negation to some non-literal on B, such that all the sentences on B' are T under τ. For suppose rule (α) was applied. The sentences on B' are those in Σ and α1 and α2. But by assumption all the sentences on B take the value T under τ. Hence by Theorem 1 both α1, and α2 will be true under τ, and so all the sentences on B' will take the value T. If rule (β) was applied, Theorem 1 tells us that one of β1 and β2 must also be T under τ. Suppose, without loss of generality, that it is β1. In this case let B' be that extension of B whose nodes are Σ and β1. Again, all the sentences on B' take the value T. Finally, if the Rule of Double Negation was applied to a sentence ¬¬X, let B' be the extension of B which includes X. Since by assumption ¬¬X takes the value T so does X, and so all the sentences on B' take the value T under τ. In each case, since B' contains only sentences true under τ, B' is open, by Lemma 1. If B' is not finished then it has an extension in the tree obtained by applying one of the tree rules to a sentence on B'. Proceeding as before, we construct another open branch B" which extends B'. Continuing in the same way we obtain a sequence of open branches B, B', B",…in the tree, each of which extends its predecessors. 
The sequence terminates at some finite stage, and when it does so we have a finished open branch in the tree, as required. Q.E.D. Note This proof is analogous to a proof by mathematical induction on the positive integers (Chapter 3, section 4). It can indeed be converted into one explicitly, though to do so hardly adds to its force. The proof of the Completeness Theorem requires another four easy lemmas. Lemma 2 Suppose B is an open branch in a finished tree and let X be any sentence on B. Then, if X is an α sentence, both α1 and α2 will appear on B below X. If X is a β sentence then either β1 or β2 will appear on B below X. If X is of the form ¬¬Y then Y will appear on B below X. Lemma 2 is an immediate consequence of the rules of tree construction. For the next three lemmas we require a couple of definitions. For each connective in the standard propositional language we define its weight as follows: the weight of ¬ is 1, and that of each of ∧, ∨ and → is 2. The degree of a sentence X in the standard propositional language is now defined to be the sum of the weights of each connective occurring in X, counting all repetitions as separate occurrences. So, for example, the degree of A is 0, of ¬¬¬A is 3, of B∨¬B is 3, of C→(¬C→(A∨¬¬B)) is 9, of ¬(¬A∨B) is 4, etc. The degree of X is a property of the formal structure of X itself; in particular, the fact


that ¬¬A is truth-functionally equivalent to A does not mean that the degree of ¬¬A is 0; it is of course 2. Lemma 3 For any α sentence X in L, the degrees of α1 and α2 are each less than the degree of X, and if X is a β sentence, the degrees of β1 and β2 are each less than the degree of X. The proof of Lemma 3 consists simply in checking that the α and β rules always eliminate a binary connective. The next lemma introduces us to another type of inductive argument, sometimes called strong induction. Lemma 4 Let ∆ be any set of L sentences and k any integer. Suppose that (1) all the sentences of degree ≤ k in ∆ have some property P, and (2) where X is any sentence of degree > k in ∆, if all sentences in ∆ of lower degree have P so does X. Then all the sentences in ∆ have P. Proof of Lemma 4 Suppose that (1) and (2) in the statement of the lemma are satisfied. By (1) all sentences of degree ≤ k have P. Now let k' be the smallest number greater than k such that there are degree k' sentences in ∆. By (2) all these sentences have P. Proceeding in this way, through the ∆-sentences of next highest degree, and then the next highest after that, and so on, we shall eventually infer that a sentence of any given degree will have P. Since every sentence in ∆ has some degree, it follows that all the sentences in ∆ must have P. Q.E.D. This Induction Principle is superficially unlike the one we encountered in the previous chapter, in that instead of the induction step linking each element with a suitably defined immediate predecessor (or predecessors, if there is more than one), here the induction step (2) links each element with all those ‘predecessors’ determined according to the criterion of having lower degree. Lemma 5 The only sentences of degree 0 or 1 in L are the literals of L. The proof is very simple and is left to the reader. Theorem 3 (Completeness Theorem for Propositional Trees) Let Σ be a finite set of sentences of L. 
Every finished open branch in a tree generated from Σ determines a truth-value distribution over the sentence letters in Σ which satisfies all the sentences in Σ.


Proof Suppose that a finished open tree has been generated from Σ, and let B be an open branch on the tree. We shall show by strong induction on degrees that all the sentences on B take the value T under some distribution τ of truth-values over the sentence letters of Σ. We do this by first of all supposing τ to be any distribution of truth-values to the sentence letters in Σ such that every literal on B takes the value T. There does indeed exist such a distribution, because B is open, by assumption, and so there is no pair C, ¬C of literals on B. Second, we identify the set ∆ in Lemma 4 with the set of all sentences on B, and finally we define the property P of sentences in that lemma as follows: a sentence X in ∆ has P just in case X is assigned the value T by τ. We shall now prove using Lemma 4 that all the sentences on B take the value T under τ.

To establish the step (1) required by the lemma, note that there is at least one literal on B (since B is open and finished). Hence we infer that the lowest degree of sentences in ∆ is 0 or 1. But we know that all the sentences on B of degree 0 or 1 are literals (by Lemma 5), and are T under τ, by assumption, and so step (1) is established.

Now for the induction step (2). Consider any sentence X on B of degree greater than 1. We shall suppose (inductive hypothesis) that every sentence on B of degree lower than that of X is true under τ, and show that from that assumption it follows that X is assigned T by τ. The reader should verify that X is either (i) an α sentence or (ii) a β sentence, or else is (iii) a multiply negated sentence (i.e. one commencing with at least two consecutive occurrences of ¬). We shall establish the induction step for each of the cases (i)–(iii).

(i) If X is an α sentence, then by Lemma 2, α1 and α2 are also on B, and by Lemma 3 both are of degree less than X. So by the inductive hypothesis they are both T under τ. Hence by Theorem 1 so is X.

(ii) If X is a β sentence then by Lemma 2 one of β1, β2 is on B, and by Lemma 3 both these are of degree less than X. By the inductive hypothesis, therefore, whichever of them is on B is T under τ, and again by Theorem 1 so is X.

(iii) If X is multiply negated it has the form ¬¬Y, and at some point, since B is finished, the Rule of Double Negation was applied to X. Hence Y is on B. But Y has smaller degree than X, and so by the inductive hypothesis Y is T under τ. Hence ¬¬Y is T, i.e. so is X.

In each of the three possible cases, therefore, X is T under τ. The induction step (2) is now complete, and so by Lemma 4 every sentence on B is T under τ. In particular, all the members of Σ are T under τ, since they are all on B. Q.E.D.

Why are Soundness and Completeness so-called? It has already been explained that the Completeness Theorem is so called because it implies that every truth-functional consequence of a set of premises can be shown to be a consequence by a tree proof. The Soundness Theorem is so called because it implies that if there is a tree proof of a sentence X from a set of premises Σ, then X is a truth-functional consequence of Σ; the tree proof, in other words, cannot prove something to be a consequence without its really being one (a theory of formal proof with this property is traditionally called sound).

Part II First-order logic

Chapter 5 Introduction 1 SOME NON-TRUTH-FUNCTIONAL INFERENCES Consider the following inference (it is a type known as an Aristotelian syllogism): (S) All Cretans are liars. All liars are wicked. ∴ All Cretans are wicked. Historical note ‘All Cretans are liars’ was a famous, some would say infamous, remark uttered, according to St Paul’s Epistle to Titus, by Epimenides the Cretan; its self-refuting nature inspired a debate about the nature of truth which continues to this day and whose recent progress we review in Chapter 11. Everyone from Aristotle onwards has taken (S) to be a paradigmatic example of a deductively valid inference. However, it is not truth-functionally valid: none of the three sentences making up the inference is a truth-functional compound of anything but itself, so within a propositional language each would have to be represented by distinct sentence letters, say A, B and C respectively. (S), in other words, has the truth-functional form A B ∴C which is truth-functionally invalid, since we have only to make the assignment T to A, T to B and F to C to get a trivial truth-functional counterexample. Of course, these can’t be real truth-values if (S) is valid, since then it would be impossible for the two premises to be true and the conclusion false. A–T, B–T and C–F is, however, a consistent distribution of Ts and Fs over the sentence letters A, B and C, indicating that if (S) is valid, then it is not valid as a function of its truth-functional structure alone. And (S) certainly is valid. A simple pictorial method of demonstrating its validity, and also that of the other valid Aristotelian syllogisms, was discovered by the German mathematician Euler in the eighteenth century and refined by the English mathematician John Venn in the nineteenth. Euler’s method is very well known (Venn’s refinement rather less so; for a brief account of it see Kneale and Kneale 1962:421). 
First replace the specific class terms ‘Cretan’, ‘liar’ and ‘wicked (person)’ by non-specific, schematic ones, say P, Q and T. P, Q and T are now represented by circles drawn inside a rectangular box D representing a universe of discourse. To say that some Ps in D are Qs means that there are things in D in the intersection of P and Q, signified by placing an asterisk in that intersection (see Figure 1 (a)); to say that no Ps are Qs means that the


circles do not intersect (see Figure 1 (b)) and to say that all Ps are Qs means that the circle P is either wholly contained within, or coincides with, the circle Q (see Figure 1 (c); the fact that there is no asterisk in that part of the interior of Q which is not included in that of P leaves it open whether P coincides with Q or whether there are Qs which are not Ps). In an Euler diagram which makes the premises of the syllogism (S) true (whatever classes of thing P, Q and T denote), the circle P lies inside the circle Q which lies within the circle T. Hence the circle P must lie inside the circle T; i.e. all Ps are Ts and hence the conclusion of (S) must be true if the premises are true. As an exercise, construct an Euler diagram which will similarly demonstrate the validity of the syllogism All Ps are Qs. Some Ps are Ts. ∴ Some Qs are Ts. While the method of Euler diagrams is fine for evaluating the relatively restricted class of syllogisms, it is quite inadequate for dealing with inferences in which the information is not about simple class inclusions, intersections, complements, etc. Consider, for example, the following inference: (*) Some positive integer is less than or equal to every positive integer. Therefore, for every positive integer, there is one less than or equal to it. (*) is deductively valid, but it cannot be shown to be by an Euler diagram (try it!). We need a more powerful method and in this and the following chapters we shall develop one, called first-order logic. As a first step we need to introduce some new notation, which will take us beyond the propositional languages of Chapter 3 to a class of formal languages, called first-order languages, capable of exhibiting much more of the logical structure of sentences than is possible within propositional languages. These more elaborate languages will still


Figure 1

include the connectives →, ∧, ∨ and ¬, but to these will be added two other logical operators called the universal and existential quantifiers and a stock of extralogical symbols called predicate and relation symbols, variables and constants. We shall proceed as we did with the propositional languages, by first informally describing the syntax and


semantics of these extended languages, then giving a more precise formal characterisation and finally proving Completeness and Soundness Theorems for an augmented set of tree rules.

2 QUANTIFIERS AND VARIABLES

The universal quantifier

The premises and conclusion of the syllogism (S) are called universal generalisations. Statements in this extensive and important class are assertions to the effect that everything in a domain D of discourse satisfies some condition or other. ‘Every’ and its variant ‘all’ are collectively known as the universal quantifier, and so important is it in modern logic that it has its own special symbol, ∀. But ∀ never occurs alone like that when it is used to make an assertion; it is always immediately followed by what is called an individual variable, or simply variable, represented by a lower-case letter drawn from the end of the Roman alphabet, usually x, y or z. So the universal quantifier will always appear in the formalised version of an ordinary-language sentence in the composite form ∀x (or ∀y or ∀z). A definite assertion is made by combining ∀x with a condition on x, which we can represent formally by P(x), thus: ∀xP(x). This is to be read ‘Every individual x in the domain D satisfies the condition P(x)’, or, more simply, ‘For every x in D, P(x) is true’ (note that there is no explicit reference to D in ∀xP(x), just as there is often no explicit reference to the domain of discourse in ordinary speech). ∀xP(x) is called a formula of the formal language (more precisely, a closed formula, but ‘formula’ will do for now) and the occurrence of the variable x following the quantifier ∀x in ∀xP(x) is said to be bound by that quantifier.
It is easy to see that ∀xP(x) is true or false just when ‘Everything in D satisfies P’ is true or false and we shall regard the formula ∀xP(x) as representing the logical form of ‘Everything in D satisfies P.’ However, no variable appears explicitly in the English sentence ‘Everything in D satisfies P’; in that case, why use a variable in its formalisation? Why not simply write ∀P, for example? The answer is that there are more complex statements than ‘Everything in D satisfies P’, for which it is at the very least useful, and may be indispensable, to employ variables and possibly more than one, to display clearly their logical structure. For example, try to paraphrase without using variables the statement that for any numbers x, y and z, x.(y+z)=x.y+x.z. You can do so, but not nearly so intelligibly and simply as if you use variables like x, y and z explicitly, which is, of course, why they were introduced into mathematics in the first place (this occurred in the seventeenth century and was one of the preconditions for the explosion of activity in the new mathematical sciences in that century). The great insight of the logical pioneers of the late nineteenth century was that what works so well in mathematics can work equally well in the representation of logical structure itself. ∀xP(x), ∀yP(y), ∀zP(z), etc. all assert that every individual in D satisfies P. Since this does not depend on D or P we can say that in all interpretations they are all true or all


false. We shall express this by saying that they are all logically equivalent sentences and we shall use the same symbol, ⇔, that we used for truth-functional equivalence to express this fact. The justification for using the same symbol is that two truth-functionally equivalent sentences are clearly logically equivalent, so that truth-functional equivalence is just a subspecies of this more extensive notion.

The existential quantifier

There is not just one type of quantifier in first-order logic but two, the second being the existential quantifier, symbolised ∃. Like the universal quantifier, it cannot exist alone in sentences but must always be accompanied by a variable to form a composite symbol ∃x. ∃xP(x) is read as saying ‘there is at least one individual x in the domain D for which P(x) is true’. Similar considerations apply here as to the universal quantifier. ∃xP(x) is just another way of saying that something in D satisfies the condition P and so ∃xP(x), ∃yP(y), ∃zP(z), etc. are logically equivalent formalisations of that same assertion.

Quantifier interdependence

There is a very important relationship between the universal and existential quantifiers: either can be expressed in terms of the other and negation. For ∀xP(x) is true if and only if every individual in D has the property P, i.e. if and only if there is no individual in D which does not have P; but this is exactly what ¬∃x¬P(x) says. Since these biconditionals also hold for any D and any property P, we can infer that ∀xP(x) is logically equivalent to ¬∃x¬P(x), i.e. ∀xP(x) ⇔ ¬∃x¬P(x). A similar argument to that above, which will be left to the reader as exercise 2 below, shows that ∃xP(x) ⇔ ¬∀x¬P(x).

Exercises

1 Suppose the domain is that of human beings, that P(x) says that x is tall and that Q(x) says that x is broad. State in words and without mentioning the variables x and y, what each of the following says: (i) (ii) (iii) (iv) Do […] and […] say anything different from (i) and (ii) respectively?

2 Explain carefully why ∃xP(x) ⇔ ¬∀x¬P(x).
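The interdependence of the two quantifiers can be checked mechanically on small finite domains. The Python sketch below is an illustration only: the helper names `forall`, `exists` and `check_all_predicates` are our own assumptions, not notation from the text, and a search over domains of up to four elements illustrates the two equivalences without proving them for arbitrary domains.

```python
from itertools import product


def forall(domain, cond):
    """Truth value of 'every x in domain satisfies cond'."""
    return all(cond(x) for x in domain)


def exists(domain, cond):
    """Truth value of 'at least one x in domain satisfies cond'."""
    return any(cond(x) for x in domain)


def duality_holds(domain, P):
    """Check both interdependence laws for one interpretation of P."""
    universal_law = forall(domain, P) == (not exists(domain, lambda x: not P(x)))
    existential_law = exists(domain, P) == (not forall(domain, lambda x: not P(x)))
    return universal_law and existential_law


def check_all_predicates(size):
    """Try every possible predicate P on the domain {0, ..., size - 1}."""
    domain = range(size)
    for values in product([False, True], repeat=size):
        # values[x] is the (hypothetical) truth value assigned to P(x)
        P = lambda x, values=values: values[x]
        if not duality_holds(domain, P):
            return False
    return True


results = {n: check_all_predicates(n) for n in range(1, 5)}
print(results)
```

Each predicate on an n-element domain is represented by a tuple of n truth values, so the loop covers all 2^n interpretations of P for each size tried.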


3 RELATIONS

A relation is a state of affairs that may or may not hold between individuals. ‘x is less than y’ is a binary, or two-place, relation between numbers; ‘x is the godmother of y’ and ‘x is a sister of y’ are binary relations between people. ‘x is between y and z’ is a three-place relation between individuals which may be numbers, or people on a seat, or times, or places, while ‘x = (y+z)/w’ is a four-place relation between numbers. Relations of more than four places might seem very arcane objects, not the sorts of things that would crop up much in practical discourse. In fact, they’re commoner than might be thought, especially in the mathematical sciences, where so-called functional relations are described which can hold between enormous numbers of individuals (for example molecules of a gas). In first-order logic, expressions of the form R(x1, x2,…, xn) symbolise n-place relations (since n is mentioned explicitly it is convenient to employ numerically subscripted variables here instead of x, y, z,…). In that expression R is called an n-place relation symbol. One-place relations are not what is normally understood by relations at all, but properties or predicates of individuals. These will be symbolised by expressions of the form P(x), Q(x), etc. P, Q, etc. are called predicate symbols. ‘Is green’, ‘is a prime number’, ‘is a nuclear reactor’, etc. are ordinary-language predicate terms. It is important to grasp that R(x, y, z,…) signifies that x, y, z in that order stand in the relation R. It may well be that if x and y in that order stand in a binary relation R then so do y and x. But it may be the case that for some pair of individuals x and y, if x and y in that order stand in R, then y and x definitely do not. For example, if R(x, y) represents the binary relation ‘x is less than y’ in the set N of natural numbers and R(x, y) is true for any pair of values of x and y in N, then R(y, x) is false for those values.
It would be impossible to convey this information if the symbolism R(x, y) did not implicitly impose an order on x and y in the way they satisfy R. This notational convention does not, however, prevent us from saying that x and y may stand in some binary relation R (for example the identity relation =) independently of the order in which they are written, for we can express this fact by means of the formula R(x, y)→R(y, x). Let us pause here and look again at the inference (*) (section 1 above). The premise and conclusion are both true statements about numbers, but at first sight they seem to be logically unrelated true statements. In fact, they are not logically unrelated at all, for (*) is a deductively valid inference: it is impossible for the premise of (*) to be true and the conclusion false. We shall prove this later, but to prepare the way it will be useful to discuss just what it means for it to be impossible for its premise to be true and the conclusion false. The clue lies in the observation that any demonstration of (*)’s validity should not depend on further unspecified information about the nature of the binary relation ‘less than or equal to’. Were it to do so then the truth of the premise, independently of that additional information, would not be sufficient to ensure the truth of the conclusion, which it is. Hence, (*) must remain valid, in the sense of the provisional definition in Chapter 1, if we replace ‘x is less than or equal to y’ (x≤y) by the symbolic representation R(x, y) of a generic binary relation. Nor should (*)’s deductive validity depend on any further unspecified information


about the nature of natural numbers themselves, which implies that it is valid independently of the domain of the quantifiers. Putting these observations together, we can conclude that showing that (*) is deductively valid means showing that whatever set D is selected as the domain of the quantifiers in the formalisation below and whatever binary relation defined in D is selected as the interpretation of R in D, the premise of

∃x∀yR(x, y)
∴ ∀y∃xR(x, y)

is never true and the conclusion false. These observations go far to redeem the promise, made in Chapter 1, that eventually we would define in a non-circular way the all-important ‘cannot’ in the provisional definition given there of deductively valid inference (‘a valid deductive inference is one whose premises cannot all be true and conclusion false’). That provisional definition can now be updated as follows: an inference is deductively valid if there is no structure consisting of a domain and relations defined in that domain which interpret the relation symbols in the inference, such that in that structure the premises are true and the conclusion is false. Any structure consisting of a domain and relations defined in that domain which interprets a formalised inference we shall, naturally enough, call an interpretation of the inference (we shall elaborate this definition later, but it is good enough for now). An interpretation which makes the premises true and the conclusion false we shall, by analogy with the truth-functional case, call a counterexample to it. When we have added tree rules for the quantifiers to the truth-functional ones of Chapter 4 we shall be in a position to prove by means of a closed tree that there is no counterexample to (*) and the various other inferences cited in this chapter. In the meantime, we need to complete the formal apparatus introduced in this chapter by adding one more item to the formal vocabulary of first-order languages. Suppose we try to formalise the following inference:

(**) If Mary is happy then everyone is happy.
∴ If Mary is happy then so is Manfred.

In (**) two specific individuals are referred to, Mary and Manfred. We have already borrowed variables from mathematics and we shall now borrow again from it, this time constants, lower-case letters a, b, c,… from the beginning of the Roman alphabet, whose function is to refer to specific individuals in the domain.
Using such constants a and b to stand for Mary and Manfred respectively and the predicate symbol M to replace the predicate ‘is happy’, the inference above can be formalised:

M(a)→∀xM(x)
∴ M(a)→M(b)

The introduction of constants completes the formal vocabulary into which we shall translate, or formalise, ordinary-language sentences. In the remainder of this chapter we shall develop some general rules and strategies for doing this.
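The notions of interpretation and counterexample introduced in this section lend themselves to a brute-force illustration. The Python sketch below rests on stated assumptions: it reads the premise of (*) as ‘some individual stands in R to every individual’ and the conclusion as ‘for every individual there is one standing in R to it’, and the function names are hypothetical. It enumerates every binary relation on domains of one to three elements; finding no counterexample there is suggestive only, since validity concerns all domains and the proof proper needs the tree rules still to come.

```python
from itertools import product


def interpretations(max_size):
    """Yield (domain, R) for every binary relation R on domains of size 1..max_size."""
    for n in range(1, max_size + 1):
        domain = list(range(n))
        pairs = [(x, y) for x in domain for y in domain]
        for bits in product([False, True], repeat=len(pairs)):
            R = {pair for pair, included in zip(pairs, bits) if included}
            yield domain, R


def premise(domain, R):
    """Some x stands in R to every y."""
    return any(all((x, y) in R for y in domain) for x in domain)


def conclusion(domain, R):
    """For every y there is some x standing in R to y."""
    return all(any((x, y) in R for x in domain) for y in domain)


def find_counterexample(prem, concl, max_size=3):
    """Return the first interpretation making prem true and concl false, if any."""
    for domain, R in interpretations(max_size):
        if prem(domain, R) and not concl(domain, R):
            return domain, R
    return None


# No small interpretation makes the premise of (*) true and its conclusion false.
print(find_counterexample(premise, conclusion))
# The converse inference, by contrast, does have a small counterexample.
print(find_counterexample(conclusion, premise))
```

The second call shows why the order of the quantifiers matters: swapping premise and conclusion gives an inference that fails in some two-element interpretation.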


Exercises

1 Suppose that the domain is the set of positive integers and that R(x, y) is now the relation ‘x is less than or equal to y’. Explain without mentioning the variables x and y what the following sentences say and whether they are true or not. (i) (ii) (iii) (iv) (v) (vi)

2 Which of (i)–(vi) remain true when R(x, y) is interpreted as ‘x is less than y’ on the same domain?

3 Explain why
(i) If P(a) is true in a domain D, then ∃xP(x) is true in D.
(ii) If P(a) is true in a domain D, then […] is false in D.

4 FORMALISING ENGLISH SENTENCES

How do we know when we have the right, or a right, first-order formalisation of a natural-language sentence? Practice helps, but the following rule is a good one to try: compare the conditions in which the formalised and unformalised sentences are each true, by using informal arguments to see what seems to follow from each and what seems to imply each. This may sound a bit vague and also question-begging given that formalising ordinary discourse is just what is supposed to aid us in seeing what does and does not follow from what. But we should not despair. We already have some logical knowledge and we can use that and the machinery we subsequently develop for cross-checking our guesses. Another good rule is to start with simple examples. The syllogism (S) at the beginning of this chapter is one such. To formalise (S), we have to formalise sentences of the form ‘All Ps are Qs’, where the domain D is not explicit. This at any rate seems straightforward, for another way of stating what is conveyed by ‘All Ps are Qs’ is by means of the universally quantified conditional ‘For any x in D, if x is a P then x is a Q’. Granted this, we can formalise ‘All Ps are Qs’ as ∀x(P(x)→Q(x)) and similarly the other sentences in the inference. Hence, letting P represent ‘Cretan’, Q represent ‘liar’ and another predicate symbol T represent ‘wicked’, we obtain the formalised version of (S):

∀x(P(x)→Q(x))
∀x(Q(x)→T(x))
∴ ∀x(P(x)→T(x))


So far so good. But what about the syllogism in section 1?

All Ps are Qs.
Some Ps are Ts.
∴ Some Qs are Ts.

We know how to deal with the ‘All Ps are Qs’ of the first premise, but what about the ‘Some Ps are Ts’ of the second? Most people’s first thought is to formalise this analogously with ‘All Ps are Qs’, i.e. as ∃x(P(x)→T(x)), being (mis)led by the apparent grammatical similarity of the two types of sentence, where the only difference seems to be in the initial quantifiers ‘Some’ and ‘All’. But ∃x(P(x)→T(x)) is definitely wrong and it is easy to show why. Consider the false sentence ‘There is an even positive integer not divisible by two.’ In the domain of the positive integers, let P be the property of being even and T that of not being divisible by two. Thus we have a statement of the form ‘Some Ps are Ts.’ But ∃x(P(x)→T(x)) is true in the domain of the positive integers and so cannot represent the logical form of ‘There is an even positive integer not divisible by two.’ (It is easy to show that ∃x(P(x)→T(x)) is true. First, we know that 3 is a positive integer which is not even. Let the constant a denote 3. So we know that ¬P(a) is true. Hence ¬P(a)∨T(a) is true, because for any sentences denoted by sentence letters A and B, ¬A∨B is a truth-functional consequence of ¬A. But ¬A∨B is truth-functionally equivalent to A→B. Hence we know that P(a)→T(a) is true. Define the predicate G(x) to be P(x)→T(x). Thus G(a) is true and hence so is ∃xG(x) (compare exercise 3(i) above); i.e. ∃x(P(x)→T(x)) is true.) An interesting lesson of this demonstration is that grammatical form is not always a good guide to logical form, for we see that there is more than a quantifier difference between the logical structure of ‘All Ps are Ts’ and ‘Some Ps are Ts.’ So what formula does exhibit the logical structure of ‘Some Ps are Ts’? This is not difficult to answer. ‘Some Ps are Ts’ says that there is at least one P which is also a T, i.e. there is at least one individual x in the domain such that x is a P and x is a T. We can straightforwardly transcribe this statement into our logical notation, whence we obtain the formula ∃x(P(x)∧T(x)). The syllogism is therefore rendered:

∀x(P(x)→Q(x))
∃x(P(x)∧T(x))
∴ ∃x(Q(x)∧T(x))
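The difference between the conditional and the conjunctive reading of ‘Some Ps are Ts’ can be made concrete by evaluating both over numbers. The Python sketch below is only an illustration under stated assumptions: the finite range 1 to 20 stands in for the positive integers, and the names `wrong_reading`, `right_reading` and `implies` are hypothetical.

```python
# A finite stand-in (an assumption) for the domain of positive integers.
domain = range(1, 21)


def P(x):
    """x is even."""
    return x % 2 == 0


def T(x):
    """x is not divisible by two."""
    return x % 2 != 0


def implies(a, b):
    """Truth-functional conditional: a -> b is equivalent to (not a) or b."""
    return (not a) or b


# Conditional reading: there is an x such that, if x is a P, then x is a T.
wrong_reading = any(implies(P(x), T(x)) for x in domain)

# Conjunctive reading: there is an x which is a P and also a T.
right_reading = any(P(x) and T(x) for x in domain)

# Any odd number makes the conditional true because its antecedent is false.
witness = next(x for x in domain if implies(P(x), T(x)))

print(wrong_reading, right_reading, witness)
```

The conditional reading comes out true although no even number fails to be divisible by two, whereas the conjunctive reading is false, as the English sentence requires.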

But now suppose we are asked to formalise the two sentences ‘All Ps are Qs’ and ‘Some Ps are Qs’ as isolated sentences, (i) subject to the constraint that the domain of the variables is in each case to be the set of Ps (we assume that it is not empty), and (ii) with the domain unspecified. (i) In the domain of Ps, ‘All Ps are Qs’ says that everything is a Q, while ‘Some Ps are Qs’ says that something is a Q. Thus ‘All Ps are Qs’ becomes simply ∀xQ(x) and ‘Some Ps are Qs’ becomes ∃xQ(x). (ii) The answer is underdetermined. ‘All Ps are Qs’ could be ∀x(P(x)→Q(x)) or it could be ∀xQ(x) if you want to make the domain the set of Ps, and there is no reason either implicit or explicit in the question why you should not. Similarly, ‘Some Ps are Qs’ is legitimately either ∃x(P(x)∧Q(x)) or ∃xQ(x). If, however, ‘All Ps are Qs’ occurs not as an isolated sentence but in the context of an inference, then the following rule must be observed: the quantified variables must all refer to the same domain throughout, just as we should take the unformalised sentences as referring to the same domain throughout. Thus in the syllogism (S) it would be definitely wrong to render ‘All Cretans are liars’ as ∀xQ(x), taking the domain to be Cretans and Q(x) the predicate ‘is a liar’, since the next premise states something about the members of a different class, that of the liars themselves. In formalising this syllogism, therefore, the predicates ‘being a Cretan’, ‘being a liar’ and ‘being wicked’ must all be regarded as predicates defined in a common domain. Now let us try something with a more complex structure. Formalise ‘Some people like everyone who likes them’ subject to the constraints that (a) the domain is one of people only, and (b) the only relation or predicate symbol you are allowed to use is a single binary relation symbol L, where L(x, y) is to be read ‘x likes y’.
The following paraphrase is a useful first step: ‘There is at least one person x such that for every person y, if y likes x then x likes y.’ Since we are now considering a domain consisting of people, explicit mention of the fact that x and y are people is unnecessary and we get ‘There is at least one x such that for all y, if y likes x then x likes y.’ Now we can translate term by term, obtaining:

∃x∀y(L(y, x)→L(x, y))
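The paraphrase ‘there is at least one x such that for all y, if y likes x then x likes y’ translates almost word for word into nested quantification over a finite domain. The Python sketch below is an illustration only: the people and the two ‘likes’ relations are invented for the example, and the function names are hypothetical.

```python
people = ["ann", "bob", "cy"]

# Two hypothetical interpretations of L(x, y), read 'x likes y'.
likes_one = {("ann", "bob"), ("bob", "ann"), ("cy", "ann")}
likes_two = {("ann", "bob"), ("bob", "cy"), ("cy", "ann")}


def likes_back_everyone(x, domain, likes):
    """For all y: if y likes x then x likes y."""
    return all((y, x) not in likes or (x, y) in likes for y in domain)


def some_like_everyone_who_likes_them(domain, likes):
    """There is at least one x who likes everyone who likes x."""
    return any(likes_back_everyone(x, domain, likes) for x in domain)


def witnesses(domain, likes):
    """All individuals x satisfying the inner universal condition."""
    return [x for x in domain if likes_back_everyone(x, domain, likes)]


print(some_like_everyone_who_likes_them(people, likes_one), witnesses(people, likes_one))
print(some_like_everyone_who_likes_them(people, likes_two), witnesses(people, likes_two))
```

In the first interpretation cy counts as a witness even though nobody likes cy: the inner conditional is then true for every y because its antecedent is always false.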

The logical structure of a sentence determines what follows deductively from it. Sometimes, however, that structure may not be made obvious by its vernacular expression, as we noted earlier. A particularly instructive example is found in what grammarians call adverbial constructions. For example, consider the following English sentences: ‘Minerva is thinking deeply’, ‘Matilda is waltzing slowly’ and ‘It is raining heavily.’ Clearly, they respectively imply that Minerva is thinking, that Matilda is waltzing and that it is raining. How are we to formalise the sentences to bring out these logical properties? One’s first answer is likely to be that ‘Matilda is waltzing slowly’ has the form P(a), where a is a constant representing Matilda and P is a predicate symbol representing the property of waltzing slowly. The trouble with this answer is that it is powerless to reveal why ‘Matilda is waltzing’ is a deductive consequence of ‘Matilda is waltzing slowly.’ For


there is no way to extract from P(a) the information that waltzing is part of P. P itself has no ‘parts’; it is just a letter. Since ‘Matilda is waltzing’ obviously is a consequence of ‘Matilda is waltzing slowly’, we seem justified in inferring that P(a) does not faithfully represent the logical form of ‘Matilda is waltzing slowly.’ A more careful analysis is needed. Let us go back to grammar. Words ending in ‘-ly’, like ‘slowly’, ‘deeply’, ‘heavily’, etc., are adverbs; they qualify verbs, in this case the verbs ‘is waltzing’, ‘is thinking’ and ‘is raining’. Verbs describe actions or processes, and hence the logical way to parse adverbial sentences is as statements asserting the existence of actions and processes possessing the relevant properties. ‘Minerva is thinking deeply’ gets parsed as ‘There is a process which is a thinking process, which is deep and which is currently being undergone by Minerva’; formally, ∃x(T(x)∧D(x)∧Q(x)), where the domain includes processes (however we want to think of these) and T represents the predicate ‘is a thinking process’, D(x) says that x has depth in some relevant sense and Q(x) that x is a process currently being undergone by Minerva. It is fairly obvious that ∃x(T(x)∧Q(x)) (‘Minerva is thinking’) is a logical consequence of ∃x(T(x)∧D(x)∧Q(x)) (we shall soon be able to prove this formally) and so our original problem is solved. The other adverbial sentences above can be dealt with similarly. But some people are wary of a logical analysis that seems to commit them to what they see as a metaphysical position, in this case the claim that actions and processes enjoy real existence. But all the formalisation has done is to make explicit what is implicit in our ordinary speech. For in ordinary speech actions and processes are certainly things to which we assign properties and place in relation to other things. This sort of commitment pervades general usage (‘Actions speak louder than words’, ‘Gluttony is a deadly sin’, etc.), whether we like it or not. But if we don’t, we shouldn’t blame the logical analysis; it merely brings out what is already there. There is another way of analysing the logical structure of adverbial sentences where there exists some scale of measurement of the quantities mentioned. Consider, for example, the sentence ‘The train is moving quickly.’ Physicists would most probably understand a sentence like this as describing the speed, or velocity, as they would term it (velocity is speed in a given direction), at which the train is moving. For them the sentence will therefore say something like ‘there is a velocity r such that v(train) = r and r is in that (vague) range of values corresponding to our (vague) concept of going quickly (quickly for trains, that is, not for supersonic aircraft)’, where v is the velocity function. We can represent ‘v(train) = k’, where k is a number, as a binary relation V(train, k), where V(a, b) holds between any pair (a, b) of individuals just in case a is a material thing and b is a number measuring the velocity of a.
So now we can formalise ‘the train is moving quickly’ as ∃y(V(a, y)∧Q(y)), where the domain consists of numbers and material objects, and maybe more besides; where a denotes the train; and where Q(y) is true for any individual y in the domain just in case y is a number falling in the range ‘quick’ when measuring velocities. Clearly, ‘the train is moving’ is formalised in this style as ∃yV(a, y), which is, as we shall soon be able to show formally, a deductive


consequence of ∃y(V(a, y)∧Q(y)). In its intended interpretation ∃y(V(a, y)∧Q(y)) refers to a domain containing material objects and the values, whether actual numbers or not, of some scale of measurement. Such ‘mixed’ domains are, if only implicitly, referred to widely in ordinary discourse. Consider, for example, Abraham Lincoln’s celebrated observation that you can fool all of the people some of the time and some of the people all of the time, but you can’t fool all of the people all of the time. Lincoln’s remark refers to both people and times and the domain of its quantifiers must consequently include both types of entity. Since we allow only a single domain for the quantifiers, these subdomains must be embraced within a ‘super-domain’ containing both types of entity, times and persons. These can then be regarded as subsets of the wider domain, distinguished formally by predicate symbols T and P respectively. We can now formalise Lincoln’s utterance as

[…]

where F(x, y) represents the binary relation ‘x can be fooled at y’ (we shall assume F(x, y) is simply false when x is not a person or y is not a time). The fact that we can introduce time into the formal discussion in this way means that we can capture within a first-order scheme a very important area of ordinary discourse that might seem otherwise out of our reach: tensed utterances. The following three statements are obviously very different in meaning: ‘Rachel went to the cinema’, ‘Rachel is now going to the cinema’, and ‘Rachel will go to the cinema’; the first is in the past, the second in the present and the third in the future tense. A subtheory of modern formal logic called temporal logic has sprung up in the last half-century or so, which adds primitive temporal operators to the usual battery of logical items, the connectives and quantifiers, in order to formalise sentences such as these. But quantifying over times, as domain objects, achieves just the same end and requires no extension of the logical vocabulary. In the process tensed statements become untensed; indeed, they become essentially timeless. Thus, the first of the three tensed statements about Rachel can be expressed as ‘There is a time t before now (t0) such that at t Rachel goes to the cinema’ and is then readily formalised as

∃t(R(t, t0)∧S(a, t))

where a is a constant denoting Rachel, S(a, t) says that the person a goes to the cinema at time t, t0 is another constant signifying the present time relative to some method of measuring time, like the usual date and clock one, and R(t, t0) says that t is before t0 according to this standard of measurement. Note that no additional predicates T(t), i.e. ‘t is a time’, or P(a), ‘a is a person’, need be introduced explicitly, since the status of t and a is built into the interpretation of the relation symbols R and S. The formalisation of the remaining two statements about Rachel is left as an exercise. Reasoning about time according to modern physics involves a larger set of relations


and predicates. These predicates and relations are those of modern mathematics and the logical structure of mathematical reasoning deserves a separate treatment, which we shall consider later. But there is nothing in this sort of reasoning, apart from its complexity, that poses any difficulty of principle in representing it within the framework of a first-order language. However, there are other constructions in English that pose more of a challenge to first-order formalisation. We have already come across one type, the so-called counterfactual conditionals. Others are modal statements, i.e. assertions of possibility and impossibility, and finally statements involving probabilities. All these topics are extensive and have had whole books written on them. Some attempt will be made in the final chapters to discuss them without going to book length to do so.

Exercises

1 Explain why, if the domain of ‘All Ps are Qs’ is some set D, the sentence is true if there are no Ps in D.

2 Formalise the following sentences. Take the quantified variables to range over a domain of people, and use the constant a to represent Jane and binary relation symbols B, S, O and L in such a way that B(x, y) stands for the relation ‘x is a brother of y’, S(x, y) for ‘x is a sister of y’, O(x, y) for ‘x is older than y’ and L(x, y) for ‘x likes y’.
(i) Jane has a brother.
(ii) Jane has no sisters.
(iii) Some people like all people.
(iv) Some people are liked by nobody.
(v) Nobody is their own brother or sister.
(vi) Some people have no brothers.
(vii) Some people have no sisters older than them.
(viii) Some people have brothers older than them whom they like.
(ix) Some people like no one’s brother, but there are sisters of some people who are liked by everybody.
(x) Some people like no one who likes themselves.
(xi) Everyone likes everyone who likes someone.

3 Formalise the following using the relations, predicates and constants indicated:
(i) Minerva is thinking deeply (domain: processes; M(x): x is undergone by Minerva; D(x): x is deep).
(ii) Carla got home at 5 p.m. yesterday (domain: times and people; S(x, y): x is a person and y is a time and x gets home at y; a: Carla; b: 5 p.m. yesterday).
(iii) Frank has seen the film and won’t see it again (domain: times and people; R(x, y): x is a time and y is a time and x is before y; S(x, y): x is a time and y is a time and x is the same as y or after y; T(x, y): x is a person and y is a time and x sees the film at y; a: Frank; b: the present time).
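The treatment of tense in section 4, where a past-tense statement is rendered by quantifying over times in a mixed domain, can be illustrated with a small model. The Python sketch below is an assumption-laden illustration, not part of the text: times are modelled as the integers 0 to 9, the present time t0 is taken to be 5, and the cinema visits are invented. As in the text, no separate predicates for ‘is a time’ or ‘is a person’ are used; that information is built into the interpretations of R and S.

```python
# A mixed domain (hypothetical): two people together with ten clock times.
people = ["rachel", "manfred"]
times = list(range(10))
domain = people + times

now = 5  # the constant t0, an assumed present time

# Hypothetical facts: who goes to the cinema at which time.
cinema_visits = {("rachel", 2), ("rachel", 7), ("manfred", 8)}


def R(x, y):
    """x is before y; false unless both x and y are times."""
    return isinstance(x, int) and isinstance(y, int) and x < y


def S(x, t):
    """Person x goes to the cinema at time t; false for any other pair."""
    return (x, t) in cinema_visits


def went_to_cinema(a):
    """There is a t in the domain such that t is before now and a goes to the cinema at t."""
    return any(R(t, now) and S(a, t) for t in domain)


print(went_to_cinema("rachel"))   # a visit at time 2 lies before time 5
print(went_to_cinema("manfred"))  # the only visit is at time 8, after time 5
```

The quantifier ranges over the whole mixed domain, people included; the non-time individuals simply fail the condition R(t, now), so they never serve as witnesses.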

Chapter 6
First-order languages: syntax and two more tree rules

1 FIRST-ORDER LANGUAGES

In the previous chapter we showed how we could represent more of the logical structure of English sentences, more, that is, than truth-functional structure, in a formal notation containing, besides truth-functional connectives, also predicate symbols and relation symbols of arbitrary numbers of places, variables, constants and quantifiers. These form the basic vocabulary items of a class of formal languages called first-order languages, whose syntax and semantics we shall investigate in this and the following chapters. Syntactically, a first-order language is like a propositional language in that it is the set of all sentences which can be constructed from some class of ‘atomic’ components using a specified set of logical operations. However, there are two important differences: first, the set of connectives is fixed, the same for all first-order languages; and second, the atomic sentences of a first-order language are now not sentence letters, single and indivisible, but themselves constructed from a specified vocabulary of logical and extralogical items. The extralogical items are themselves sub-divided into a ‘descriptive’, or referential, part and a structural part. These categories of vocabulary item are specified as follows (the boldface capital letter L refers to an arbitrary first-order language):

(i) L’s logical vocabulary contains the same connectives ∧, ∨, ¬ and → as the standard propositional language of Chapter 3 and in addition the two quantifiers ∀ and ∃.

(ii) The referential part of L’s extralogical vocabulary consists of a set of predicate and n-place (n>1) relation symbols (how many of each may vary from language to language, though there may be infinitely many of both and there must be at least one predicate symbol if there are no relation symbols and vice versa) and a set (possibly empty) of constants. The exact nature of L’s predicate and relation symbols need not concern us; all the discussion of them is carried out in the metalanguage (Chapter 3, section 2) and in this metalanguage we shall use the capitals P, Q and if necessary also P1, P2,…, Q1, Q2,… etc. to refer to distinct predicate symbols of L. Relation symbols of L will be referred to by capitals R and S and if necessary also R1, R2,…, S1, S2,…. The number of places of any relation symbol will be assumed known without needing explicit signalling by means of a dedicated notation. Constants of L will be represented by lower-case letters a, b, c from the beginning of the Roman alphabet and if we run out, a1, a2,…, b1, b2,…, c1, c2,….

(iii) The structural items in L’s extralogical vocabulary are two brackets (), the comma, and an indefinitely large supply of variables. We shall represent distinct variables, as before, by distinct lower-case letters x, y, z,… from the end of the Roman alphabet and by


x1, x2,…if we run out of these. Define an expression of L to be any finite string of symbols from L’s vocabulary. Some of these will be ‘meaningful’, like for example, if L contains the predicate symbol P; others will not, like x) xx, xR. We shall now proceed in stages to identify these ‘meaningful’ strings and in particular those of which it can sensibly be said that they are true or false when interpreted in an appropriate domain. A notational convention: in this and the following chapters, italic capitals A, B, C…from the beginning of the Roman alphabet will be used to denote arbitrary expressions. In more precise terminology, A, B, C…are metalinguistic variables ranging over the set of expressions of L; however, like Horace who saw and approved the better and followed the worse, we shall generally continue in the sloppier way to talk about arbitrary expressions, sentences, languages, etc. The potential truth- and falsity-bearing expressions of L are what we are really interested in. By analogy with their informal counterparts these will be called the sentences of L. Rather than defining them directly, it is easier first to take a detour via a larger class of expressions called the formulas of L. Recall from the previous chapter that an English sentence of the form ‘All Ps are Qs’ can be formalised as a universally quantified conditional . We can think of as built up from the basic vocabulary of a first-order language in the following increasingly large ‘pieces’ . These pieces will be called formulas of L and the pieces en route subformulas of the final formula. Like the corresponding class of sentences of a propositional language, the class of formulas of L can be uniquely specified by an inductive definition (cf. Chapter 3, section 3). First we define the class of expressions which are unconditionally formulas of L. 
These are called the atomic formulas of L and they are all expressions of the form P(t) or R(t1,…, tn), where R is an n-place relation symbol, for those values of n>1 such that L has relation symbols of those numbers of places and where t, t1,…, tn are any constants or variables of L. An expression A is now said to be in the class F of formulas of L if and only if A is either (i) an atomic formula of L, or (ii) of the form ¬B, (B∧C), (B∨C), (B→C), ∀xB, ∃xB, where x is any variable and B and C are formulas of L. Brackets are placed around A∧B, A∨B and A→B in (i) and (ii) so that the subformula structure of each formula in L is determinate: the subformulas of a formula A can be defined explicitly as all the nodes on A’s ancestral tree (this is like the ancestral tree of a sentence in a propositional language, except that ∀xB and ∃xB each have a single vertical branch down to B). As before we shall omit outer brackets in ordinary discussion, writing ‘the formula A→B’ rather than ‘the formula (A→B)’. To aid the eye, we shall sometimes alternate curved and square brackets [] in complex formulas. Where ∀xB and ∃xB are formulas of L, the quantifiers ∀x and ∃x are said to have an initial occurrence in them (they may also have other occurrences in these formulas). The scope of those initially occurring quantifiers ∀x and ∃x is in each case said to be the occurrence of the subformula B immediately following each of them. A variable is said to occur in a formula if it appears in that formula at some point other than that immediately following a quantifier; for example, ∀xP(x) has only one occurrence of x. An occurrence

Logic with trees


of a variable x in a formula A is said to be free if it is not in the scope of any quantifier ∀x or ∃x in A. An occurrence of a variable is bound if it is not free, i.e. if it is in the scope of such a quantifier. Thus there are four occurrences of a variable in the formula , three of which are free and one bound; the second occurrence of y is bound. From the way freedom and bondage for variables are defined, it is clear that every occurrence of a variable in a formula is either free or bound. A formula which has a free occurrence of some variable is said to be open. A formula which is not open is closed. The closed formulas are also called the sentences of L. The sentences of L, so defined, are so called because they will be the expressions of L which can be true or false, depending on the interpretation of the predicate and relation symbols in them. But in that case the definition of F seems definitely over-permissive, for it includes as closed formulas expressions like or even . These ‘sentences’ seem to make very little sense. They are included in F because to exclude them would make for a very complicated definition of formula and, as we shall see in the next chapter, they do in fact make perfectly good sense; will turn out to say the same as and the same as P(a). However, such formulas can easily be avoided in practice and we shall not be bothered by them. We end this section by introducing some notational conventions which will be useful in the subsequent discussion. We shall signify by A(x1,…, xn) an arbitrary open formula of L with free occurrences of the variables x1,…, xn. Thus A(x) signifies a formula free in just the one variable x. Where A is any formula and t a constant or variable, A(t/x) signifies the result of substituting t for every free occurrence of x in A; if A has no free occurrence of x we shall regard A(t/x) as just A itself. Where A is known from the context of the discussion to have free occurrences of x and only of x, i.e. where A is A(x), we shall usually write A(t) instead of A(t/x).

Exercises In the following assume that the relevant first-order language contains all the constants and predicate and relation symbols mentioned. 1 Explain carefully how, by reference to the clauses (i) and (ii) in the definition above of formulas of L, you can determine that the expression is a formula of L. List all its subformulas. 2 What is the scope of the quantifier x in each of the following? (a) (b) (c)


(d) (e) 3 All the sentences in question 1 are of the form

.

(i) What is A(a) in each case? (ii) What is B(a/x) in each case? 4 Specify the scope of each occurrence of a quantifier in the formula and also indicate all the free occurrences of each variable. 5 Indicate all free occurrences of a variable in 6 Is the formula R(a, b) open or closed?

.
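The inductive definition of formulas in section 1, together with the notions of free occurrence and of the substitution A(t/x), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple encoding, the tag names ("atom", "not", "and", "or", "imp", "all", "ex") and the helper names are hypothetical choices, not the book's notation, and a term is taken to be a variable exactly when it begins with x, y or z.

```python
def atom(symbol, *terms):
    """Atomic formula P(t) or R(t1,...,tn), encoded as a tagged tuple."""
    return ("atom", symbol, tuple(terms))

def is_variable(term):
    # Assumed convention: variables start with x, y or z; constants with a, b, c.
    return term[0] in "xyz"

def free_vars(formula, bound=frozenset()):
    """Set of variables having at least one free occurrence in the formula."""
    tag = formula[0]
    if tag == "atom":
        return {t for t in formula[2] if is_variable(t) and t not in bound}
    if tag == "not":
        return free_vars(formula[1], bound)
    if tag in ("and", "or", "imp"):
        return free_vars(formula[1], bound) | free_vars(formula[2], bound)
    if tag in ("all", "ex"):
        _, var, body = formula
        return free_vars(body, bound | {var})
    raise ValueError(f"not a formula: {formula!r}")

def is_sentence(formula):
    """A sentence is a closed formula: no variable occurs free in it."""
    return not free_vars(formula)

def substitute(formula, t, x):
    """A(t/x): put t for every free occurrence of x (no capture check is made)."""
    tag = formula[0]
    if tag == "atom":
        return ("atom", formula[1], tuple(t if s == x else s for s in formula[2]))
    if tag == "not":
        return ("not", substitute(formula[1], t, x))
    if tag in ("and", "or", "imp"):
        return (tag, substitute(formula[1], t, x), substitute(formula[2], t, x))
    if tag in ("all", "ex"):
        _, var, body = formula
        if var == x:          # every occurrence of x below this point is bound
            return formula
        return (tag, var, substitute(body, t, x))
    raise ValueError(f"not a formula: {formula!r}")

# 'All Ps are Qs' as a universally quantified conditional.
all_p_q = ("all", "x", ("imp", atom("P", "x"), atom("Q", "x")))

# An open formula in which x occurs both free and bound.
mixed = ("imp", atom("P", "x"), ("ex", "x", atom("R", "x", "y")))
```

Here `is_sentence(all_p_q)` holds, while `substitute(mixed, "a", "x")` changes only the free occurrence of x in P(x) and leaves the bound occurrence inside the existential subformula alone.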

2 TWO MORE TREE RULES

We shall now introduce tree rules for the two quantifiers, in each case by a pair of conjugate diagrams. We shall work backwards to them by supposing that we have generated a finished tree from a set Σ of first-order initial sentences, in which there is an open branch B. Recall that an open branch in a truth-functional truth tree determined a ‘world’ in which all the initial sentences were true; ‘world’ is in quotes because it was really just an assignment of the value T to each literal on the branch. We shall suppose that B also furnishes a ‘world’ in which all the sentences in Σ are true, but in this case one which is a bit more like a world, with a domain of individuals and predicates and relations defined in that domain. It will also be a ‘small world’ in the sense, roughly the same as that which economists give the term, that the only individuals in it will be those named by a constant appearing on B. This B-world is of course an interpretation of the first-order language whose predicate and relation symbols are those of the sentences in Σ. In the B-world a universally quantified sentence ∀xA(x) in that language is true if A(a) is true for every constant a on B (this is the ‘small world’ assumption), while ∀xA(x) is false (¬∀xA(x) is true) if for some constant c on B A(c) is false (¬A(c) is true). Thus we obtain a pair of unsigned conjugate diagrams for the universal quantifier:

The set-theoretic notation {A(a): a is a constant on B} is read ‘the set of all A(a) where a is a constant on B’. Similarly, an existentially quantified sentence ∃xA(x) is true in the B-world if A(c) is true for some constant c on B, while it is false (¬∃xA(x) is true) if A(a) is false (¬A(a) true) for every constant a on B; and so we have the following pair of conjugate diagrams for the existential quantifier:

Notice that, as with the diagrams for the connectives, we can also read these diagrams downwards, as saying that if the upper sentence in each is true, so are all the lower ones. This is important, because, as in the earlier truth-functional case, it will enable us to interpret a closed tree as signifying that the initial sentences cannot all be true together; in other words, the tree rules represented by these diagrams are sound. Call a sentence of the form ∀xA(x) or ¬∃xA(x) a γ sentence and one of the form ∃xA(x) or ¬∀xA(x) a δ sentence. In the table below we define corresponding sentences γ(a) and δ(a), where a is any constant of L:

γ            γ(a)         δ            δ(a)
∀xA(x)       A(a)         ∃xA(x)       A(a)
¬∃xA(x)      ¬A(a)        ¬∀xA(x)      ¬A(a)

γ(a) and δ(a) are called the instantiations of γ and δ with the constant a. We can now collapse the four unsigned quantifier diagrams into two:

Call {γ(a): a is a constant on B} the descendant of γ in (i) and δ(c) the descendant of δ in (ii). We can adopt diagram (ii) as a new tree rule, which we shall call the rule (δ), on the provisional hypothesis that the sentence to which (δ) is applied is a node on some eventually finished open branch which can be identified with B above. Of course, the hypothesis may be false, for all the branches passing through that node may close. But if it is not false we must somehow write into the statement of (δ) that the choice of c must be made in such a way that nothing else on B conflicts with c’s role of satisfying the condition A(x) or ¬A(x), as the case may be. The following condition turns out to be necessary and sufficient: the constant c in δ(c) must not be one which has already appeared on the branch above the point at which δ(c) is placed. The condition is necessary, because otherwise we could have this:


The proof that the condition is sufficient must wait until Chapter 8. Diagram (i) cannot, however, as it stands be used as a tree rule. There is nothing wrong with the descendant of γ being a set of sentences rather than a single sentence: after all, the descendant of an α sentence is a set, {α1, α2}. The problem with (i) is that there may not be a finished open branch in the tree (it may close) and even if there is we may not know, at the stage in its development at which we want to apply (i), which constants are on it. We can’t simply identify the set of constants on a branch, open or closed, with those in the initial sentences, for we now know that new constants not yet appearing may subsequently have to be added by an application of (δ). Fortunately, we can modify (i) to get round these difficulties quite easily, while still remaining in the spirit of the enterprise. We simply allow the instantiations γ(a) of γ to be introduced piecemeal on any branch as these new constants get added (if they do) to it. When and only when γ has been instantiated with every constant on the branch (including one introduced specifically for that purpose if there would otherwise have been none) shall we say that γ is used on that branch. This is by contrast with the other tree rules, where the sentences to which they are applied are used on a branch as soon as their descendants are placed on it. In the light of all these considerations, we can formulate the tree rules (γ) and (δ) as follows (N.B.: B is now the as yet unfinished branch on which the descendant in each case is placed):

As earlier, a tree will be said to be finished when either it closes or every usable sentence on every open branch is used on that branch. This is still not quite the final form of these rules, but it is final enough for our purposes now. Two features of the rules should be noted. First, though the constraints on them are inspired by semantic considerations, the rules themselves are purely formal (syntactical). They can be implemented without reference to any interpretation of the language; all one has to know at the point of applying them are the sentences so far generated on the branch. Second, their validity is not restricted in any way by the ‘small world’ assumption made at the outset; it will turn out that if a set of first-order sentences is true in any world then it is true in a ‘small’ one and conversely (this is the content of a celebrated result called the Löwenheim-Skolem Theorem, which we shall prove in Chapter 8). To get a feel for how the rules work, we shall construct a closed tree from the initial sentences

:

Two features of this tree deserve comment. (i) There is a clear strategic advantage to applying (δ) before (γ). This is not only because once the (δ) rule has been applied to a sentence that sentence is used once and for all, but also because giving the (δ) rule priority minimises the number of individual constants which have to be introduced. (ii) We have extended the (α) and (β) rules in a natural way to first-order sentences: (β) was applied to the sentence P(a)→Q(a), β1 being ¬P(a) and β2 Q(a). From now on the (α) and (β) rules are applied to any formulas which have the appropriate truth-functional form: for (α), that of conjunctions, negations of disjunctions and negations of conditionals; and for (β), that of disjunctions, conditionals and negations of conjunctions. It might seem from this example that first-order truth trees are just about as well behaved as trees for sentences in the standard propositional languages. Well behaved they are, as we shall see in due course, but they are not quite as well behaved as the purely truth-functional trees. In the first place, to ensure that trees which can close do close we shall need to impose conditions on the order of application of the tree rules. Second, we now have to contemplate infinite trees, as the following example shows. Suppose we try generating a truth tree from the single initial sentence . This is what happens:

The ‘etc.’ signifies that the tree goes on for ever! Clearly, every time one of the new constants introduced by the application of the (δ) rule instantiates the initial γ sentence , it gives rise to yet another δ sentence, which then introduces another constant and so on ad infinitum. The possibility that infinite trees may be generated from finite sets of initial sentences might seem to introduce an uncontrollable dimension into the theory of first-order truth trees, but actually this is not so. For the single-branched tree above, though infinite, is none the less well behaved enough. It is unambiguously a finished open tree: the initial γ sentence on it is definitely used, according to the criterion laid down earlier, since for every constant on the branch, the instantiation with that constant of the γ sentence appears on the branch. Exercises 1 For each of the following, state whether it is a γ or δ sentence, or neither of these. (a) (b) (c) Q(b)→R(a, b) (d) (e) (f) (g) ¬P(a) 2 For each of the γ and δ sentences in question 1, what are γ(a), δ(a)? 3 In two lines of the apparently closed tree below, generated from the initial sentences and

, a rule has been misapplied. Identify the line and


explain how the rule is misapplied.

4 Identify a domain and a binary relation defined in it, such that in that interpretation both and

are true.
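The purely syntactical character of the rules (γ) and (δ) in section 2 can be brought out by a small Python sketch. Everything about the encoding is an assumption made for illustration: formulas are nested tuples, terms beginning with x, y or z count as variables, and the function names are hypothetical. The sketch handles a single branch only; branching and closure are left out.

```python
from itertools import chain, count

def substitute(f, t, x):
    """A(t/x): replace the free occurrences of x in f by t."""
    tag = f[0]
    if tag == "atom":
        return ("atom", f[1], tuple(t if s == x else s for s in f[2]))
    if tag in ("all", "ex"):
        return f if f[1] == x else (tag, f[1], substitute(f[2], t, x))
    return (tag,) + tuple(substitute(g, t, x) for g in f[1:])

def constants(f):
    """Constants occurring in f (terms not starting with x, y or z)."""
    tag = f[0]
    if tag == "atom":
        return {s for s in f[2] if s[0] not in "xyz"}
    if tag in ("all", "ex"):
        return constants(f[2])
    return set().union(*(constants(g) for g in f[1:]))

def branch_constants(branch):
    return set().union(*(constants(f) for f in branch))

def classify(s):
    """'gamma' for universal-type sentences, 'delta' for existential-type."""
    if s[0] == "all" or (s[0] == "not" and s[1][0] == "ex"):
        return "gamma"
    if s[0] == "ex" or (s[0] == "not" and s[1][0] == "all"):
        return "delta"
    return None

def instantiate(s, a):
    """gamma(a) or delta(a): strip the quantifier and put a for its variable."""
    if s[0] == "not":
        _, x, body = s[1]
        return ("not", substitute(body, a, x))
    _, x, body = s
    return substitute(body, a, x)

def fresh_constant(branch):
    """First of a, b, c, a1, a2, ... that has not yet appeared on the branch."""
    used = branch_constants(branch)
    for name in chain("abc", (f"a{i}" for i in count(1))):
        if name not in used:
            return name

def apply_delta(branch, delta):
    """Rule (delta): instantiate with a constant new to the branch."""
    c = fresh_constant(branch)
    branch.append(instantiate(delta, c))
    return c

def apply_gamma(branch, gamma):
    """Rule (gamma): instantiate with every constant currently on the branch,
    introducing one for the purpose if there would otherwise be none."""
    names = branch_constants(branch) or {fresh_constant(branch)}
    for a in sorted(names):
        inst = instantiate(gamma, a)
        if inst not in branch:
            branch.append(inst)

# An assumed example of the kind that generates an infinite branch.
g = ("all", "x", ("ex", "y", ("atom", "R", ("x", "y"))))
branch = [g]
apply_gamma(branch, g)             # adds the instance of g with the constant a
new_c = apply_delta(branch, branch[1])
apply_gamma(branch, g)             # g must now be instantiated with new_c too
```

Each pass through `apply_gamma` produces a fresh δ sentence, whose treatment by `apply_delta` introduces another constant, so the alternation never terminates for this sentence.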

3 TREE PROOFS Extending the terminology of Chapter 4, we shall say that there is a tree proof of a conclusion C from a set Γ of premises if the set of initial sentences consisting of Γ together with ¬C generates a closed tree, using any of the rules of (α)–(δ) and Double Negation. The symbolism Γ├C will mean that there is such a tree proof of C from Γ. If there is such a proof, it follows from the Soundness Theorem that we shall prove later that there is no interpretation of the first-order language in which premises and conclusion are formalised in which those premises are true and the conclusion false; i.e. C is a valid inference from Γ. We shall end this chapter with tree proofs for some of the inferences we discussed earlier, starting with one for the syllogism (S) as formalised in Chapter 5, section 2:


That was easy. So is the tree proof for the other syllogism of Chapter 5 (sections 1, 4): All Ps are Qs. Some Ps are Ts. ∴ Some Qs are Ts. which will be left as an exercise. The next tree proof we give is for (*), Chapter 5, sections 1, 3:

We shall end by stating two useful facts about binary relations and proving one of them. Call a binary relation, symbolised by R, reflexive on a domain D if the sentence ∀xR(x, x) is true in D. R is irreflexive on D if ∀x¬R(x, x) is true in D. R is symmetric on D if ∀x∀y(R(x, y)→R(y, x)) is true in D; R is asymmetric on D if ∀x∀y(R(x, y)→¬R(y, x)) is true in D. R is transitive on D if ∀x∀y∀z[(R(x, y)∧R(y, z))→R(x, z)] is true in D. R is intransitive on D if ∀x∀y∀z[(R(x, y)∧R(y, z))→¬R(x, z)] is true in D. The two useful facts are that (i) if R is asymmetric on D it is irreflexive on D and (ii) if R is transitive and irreflexive on D it is asymmetric on D. We shall prove (i) by giving a tree proof of irreflexivity from asymmetry:

(ii) is left as exercise 7 below. Exercises 1 Show that (i)

(ii)

(iii)

(iv)

(v)

(vi)

2 Show that interchanged) and that

and conversely (i.e. with premise and conclusion and conversely.

3 Show that (cf. Exercise 1, Chapter 5, section 4). 4 A set A is a subset of a set B (standardly symbolised A⊆B) if every member of A is a member of B. Consider a domain D consisting of arbitrary things and sets. Let R(x, y) be true in D just when y is a set and x is an element of y (in mathematics textbooks this relation is written x∈y). So we can formalise the statement ‘y is a subset of z’, i.e. ‘for all x, if x is an element of y then x is an element of z’, as It can be proved from the axioms of set theory that there is a unique set which has no members and this is called the empty set (we saw earlier that it is given the conventional symbol Ø); i.e. where a denotes Ø,

is true.


Show by a tree proof that , i.e. the empty set is a subset of every set. 5 Give a tree proof of the inference (**) in Chapter 5, section 3. 6 Show that

(i) (ii) (iii) (iv) (v) Show that (iv) and (v) remain true when

is replaced by ∃.

7 Give a tree proof which establishes that every transitive irreflexive binary relation is asymmetric. 8 Let Σ be some set, possibly empty, of sentences. Show that (i) If there is a tree proof of a sentence A from a sentence B together with Σ then there is a tree proof of B→A from Σ. (ii) If there is a tree proof of a sentence C from a sentence A together with Σ and of C from a sentence B together with Σ, there is a tree proof of C from A∨B together with Σ. (iii) If there is a tree proof of a sentence B from A(a) together with Σ, where a does not occur in the tree, then there is a tree proof of B from ∃xA(x) together with Σ.
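The two facts about binary relations stated at the end of section 3 can be checked mechanically on a small finite domain. The Python sketch below is not a tree proof and proves nothing about arbitrary domains; it merely enumerates every binary relation on an assumed three-element domain and confirms that no counterexample turns up. Relations are represented as sets of ordered pairs and all function names are hypothetical.

```python
from itertools import product

def is_reflexive(R, D):
    return all((x, x) in R for x in D)

def is_irreflexive(R, D):
    return all((x, x) not in R for x in D)

def is_symmetric(R, D):
    return all((y, x) in R for (x, y) in R)

def is_asymmetric(R, D):
    return all((y, x) not in R for (x, y) in R)

def is_transitive(R, D):
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)

def all_relations(D):
    """Yield every binary relation on D, i.e. every set of ordered pairs."""
    pairs = list(product(D, repeat=2))
    for bits in product([False, True], repeat=len(pairs)):
        yield {p for p, keep in zip(pairs, bits) if keep}

D = (0, 1, 2)

# (i) asymmetric implies irreflexive
fact_i = all(is_irreflexive(R, D)
             for R in all_relations(D) if is_asymmetric(R, D))

# (ii) transitive and irreflexive together imply asymmetric
fact_ii = all(is_asymmetric(R, D)
              for R in all_relations(D)
              if is_transitive(R, D) and is_irreflexive(R, D))

# The relation 'greater than' restricted to D, as a concrete instance.
greater = {(x, y) for x in D for y in D if x > y}
```

A three-element domain has 512 binary relations, so the enumeration is instantaneous; the general claims still need the tree proofs asked for in the text.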

Chapter 7 First-order languages: semantics

1 INTERPRETATIONS

Chapter 5 referred to first-order languages interpreted in some domain. To generate the results of the next chapter we need to be a bit more precise about just what sorts of things interpretations of L are. To prepare the following discussion, suppose that L contains a binary relation symbol R and consider the sentence (i.e. closed formula) of L. As it stands it has no truth-value, because no domain has been specified. But it automatically acquires a truth-value once we specify a domain and a binary relation defined in the domain as the interpretation of R. Now suppose L also contains a constant a. For the sentence ∃xR(x, a) to have a truth-value in that domain a must be made to refer to some individual in the domain. We can generalise these observations as follows: specifying an interpretation of a first-order language will mean specifying a domain and interpretations in that domain of the extralogical vocabulary of that language. We shall use the capital Gothic to refer to a generic interpretation of L. The domain of

will be written

and the binary relation defined in

interpreting R will be

written . An interpretation of L which we shall sometimes use for illustrative purposes is N, whose domain is N, the set {0, 1, 2, 3,…} of natural numbers and in which R N is >; i.e. R(x, y) is interpreted in N as the relation x>y. It is not difficult to see that so interpreted the L-sentence is true, and is false: the former sentence says (in N) that for every natural number there is a greater, which is true, while the second says that some natural number is greater than every natural number, including itself, which is false. However, in Chapter 6 it was shown that the sentences

and

generate a closed tree. From this and the Soundness Theorem proved in Chapter 8 we can conclude that all attempts to find ways of making those two initial sentences jointly true fail: there is no interpretation of L in which those two sentences are true. In particular, if

, there is no binary relation

of natural numbers such

that both those sentences are true in . We should pause at this point to think through the implications of this remark. What exactly does it mean to say ‘there is no binary relation of natural numbers such that …’? What is included in this class? There are many familiar binary relations of natural numbers, for example , ≥ a and identity =. With a little thought we can come up with some less familiar ones. Does ‘all binary relations of natural numbers’ mean merely those for which we currently have names? Surely not. Knowledge develops and our conceptual portfolio develops hand in hand


with it. It would be short-sighted, to say the least, to restrict a theory of deductive consequence to the items in that portfolio at any given time. On the other hand, the notion of an arbitrary binary or for that matter n-place relation on an arbitrary domain sounds so nebulous that to translate it into something concrete and acceptably objective would seem on the face of it a hopeless enterprise. Surprisingly, this turns out to be not at all the case. Indeed, the solution to the problem has been known for almost a century. The germ of the solution lies in a distinction first drawn by logicians centuries ago, between intensions and extensions. The intension of a property, for example being a person, is, roughly speaking, the meaning of the phrase ‘is a person’. The extension is the set of all things, in this case the set of people, that have the property. Similarly, the intension of, say, a binary relation is the meaning of a standard description of it. Its extension requires a little more consideration. Recall from an earlier discussion (Chapter 5, section 3) that implicit in the notation R(x, y) is that x and y in that order stand in the relation represented by R. It therefore seems natural to say that the things of which R(x, y) is true in any domain are ordered pairs of domain elements. We have reached an important point in the discussion, for ordered sets will play a central role in our theory of interpretations of first-order languages and to explain clearly what they are we need to make a brief digression into elementary set theory. In set theory the word ‘set’ unqualified means ‘unordered set’ and it is customary to signify these by using curly brackets {} to enclose the terms denoting their members; we have already used these set brackets in earlier chapters. Thus the unordered pair consisting of Cain and Abel, say, is written {Cain, Abel} and because it is an unordered set, {Cain, Abel}= {Abel, Cain}. 
But the ordered pair of Cain and Abel, in that order, is written with curved brackets enclosing ‘Cain’ first and ‘Abel’ second, thus: (Cain, Abel). For ordered sets (u, v) and (v, u), (u, v)=(v, u) if and only if u=v. But Cain ≠ Abel and so (Cain, Abel) ≠ (Abel, Cain). Indeed, the first pair is in the extension of many binary relations (for example, is or was a slayer of) of which the other is not. Another relevant feature which distinguishes ordered pairs from unordered ones is that the set {Cain, Cain} is not a pair at all: the set-theoretical Axiom of Extensionality says that an unordered set is uniquely determined by its members, from which it follows that {Cain, Cain} = {Cain}. By contrast, the ordered set (Cain, Cain) is a genuine pair and indeed there is a familiar binary relation defined in the domain D={Cain, Abel} of whose extension both (Cain, Cain) and (Abel, Abel) are members, that of identity, = (nor is this the only one: being the same height as is another). The set of all ordered pairs of members of D thus has four members: (Cain, Cain), (Cain, Abel), (Abel, Cain), (Abel, Abel). Just as there is a set of all ordered pairs of members of D, so there is a set of all ordered triples, quadruples,… and in general n-tuples of members of D, for any positive n (the set of all 1-tuples we can regard simply as D itself). The set of all ordered triples of members of D has 8 members (Cain, Cain, Cain), (Cain, Cain, Abel),…, (Abel, Abel, Abel), that of quadruples of members of D has 16 members and that of n-tuples of members of D has 2^n members. The set of all ordered n-tuples of D is written in set-theoretic notation as D^n; it is also called the nth Cartesian product of D with itself. That ends the set theory. We can now identify the extension of an n-place relation in a domain with the corresponding set of ordered n-tuples of domain elements which stand in


that relation (formally, the notation R(x1,…, xn) suggests that R is a predicate of ntuples). Being sets, extensions seem to be admirably objective in character and also—an important bonus—well understood mathematically. Intensions, by contrast, seem to be just the sort of knowledge-dependent entities that we decided we did not want to base our logical theory on. The appropriate strategy in these circumstances is to apply what the philosophers call Occam’s Razor (allegedly introduced into philosophical debate by the Schoolman William of Occam, Occam’s Razor is the injunction not to multiply entities unnecessarily) and eliminate intensions entirely from consideration, identifying properties and relations straightforwardly with their extensions. The next step is to regard any set of n-tuples of members of some domain as the extension of some relation defined in it. Thereby the apparently nebulous notions of an arbitrary property and of an arbitrary n-place relation defined in a domain D are replaced by a well-understood and objective mathematical concept, an arbitrary set of the appropriate dimension (the dimension of a set of n-tuples is n; a subset of D itself is defined to have dimension 1). We are now in a position to define the notion of an interpretation of an arbitrary first-order language with complete generality; the definition, due originally to the Polish-American logician and mathematician Alfred Tarski, is as follows: An interpretation of a first-order language L is a rule specifying (i) a non-empty set

of individuals, called the domain of the variables of L. We

should recall from our discussion in Chapter 5 that the individuals in are not necessarily physical objects. They can be anything which can be conceptually individuated, concrete or abstract, like processes, actions, numbers, algebraic structures, thoughts, emotions, or what have you. (ii) for each individual constant c of L, a particular member is the individual in

of

. In other words,

named by the constant c of L.N.B.: there is no rule

preventing the same individual in

being the interpretation of more than one

constant, i.e. there is nothing to stop being the same individual in as (this should not worry anybody familiar with the custom of many people giving their offspring more than one name). (iii) for each predicate symbol P of L, a set specifies which individuals in

of individuals in

are to have the property P in

P(c), where c is a constant, will be true in

just in case

(iv) for each n-place relation symbol R of L (n>1), a set individuals in relation in

.

is in

—this subset of . Thus the sentence .

of ordered n-tuples of

specifies those n-tuples of individuals which determine the R-

. Thus, if c1,…, cn are constants and R an n-place relation symbol, R(c1,

…, cn) is true in

just in case the n-tuple (

) is in

.

A consequence of the definition of an n-place relation as a set of ordered n-tuples of


is that the empty set Ø automatically qualifies as an n-place relation. For, as we know from exercise 4, section 3, Chapter 6, Ø is a subset of any set and so qualifies both as a subset of

and as a subset of

all n-tuples of

, the set of

. To assign the empty set as the interpretation

symbol P means that no individuals in

will be Ps in

relation symbol of L, this means that no individuals in

; if

of a predicate

is Ø, where r is any

are R-related in

.
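The definition of an interpretation given in this section, with its clauses (i)–(iv), can be pictured as a small data structure. The Python sketch below is an illustration under stated assumptions rather than part of the formal development: the class name, the field names and the relation symbols S and E are hypothetical, and only atomic sentences built from constants are evaluated. It also counts the ordered pairs and triples of the two-element domain discussed earlier.

```python
from dataclasses import dataclass, field
from itertools import product

@dataclass
class Interpretation:
    domain: frozenset                               # clause (i): non-empty set of individuals
    constants: dict                                 # clause (ii): constant -> individual
    predicates: dict = field(default_factory=dict)  # clause (iii): P -> set of individuals
    relations: dict = field(default_factory=dict)   # clause (iv): R -> set of n-tuples

    def __post_init__(self):
        if not self.domain:
            raise ValueError("the domain must be non-empty")
        for name, individual in self.constants.items():
            if individual not in self.domain:
                raise ValueError(f"constant {name} must name a domain member")
        for symbol, extension in self.predicates.items():
            if not extension <= self.domain:
                raise ValueError(f"extension of {symbol} is not a subset of the domain")
        for symbol, extension in self.relations.items():
            for tup in extension:
                if not all(member in self.domain for member in tup):
                    raise ValueError(f"extension of {symbol} contains a non-member")

    def atomic_true(self, symbol, *names):
        """Truth-value of P(c) or R(c1,...,cn) for constants c, c1,...,cn."""
        referents = tuple(self.constants[n] for n in names)
        if len(referents) == 1:
            return referents[0] in self.predicates[symbol]
        return referents in self.relations[symbol]

D = frozenset({"Cain", "Abel"})
interp = Interpretation(
    domain=D,
    constants={"a": "Cain", "b": "Abel", "c": "Cain"},   # a and c name the same individual
    predicates={"P": {"Cain"}},
    relations={"S": {("Cain", "Abel")},                  # S: is or was a slayer of
               "E": set()},                              # the empty set is also a relation
)

pairs = list(product(D, repeat=2))     # the members of D^2
triples = list(product(D, repeat=3))   # the members of D^3
```

With this interpretation `interp.atomic_true("S", "a", "b")` holds and `interp.atomic_true("S", "b", "a")` does not, reflecting the fact that the extension of S contains the ordered pair (Cain, Abel) but not (Abel, Cain).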

Exercises 1 Show that a tree generated from the set

closes. What does this tell you about the identity of

in any interpretation →¬Q(x)) true?

which makes the sentence 2 Show that a tree generated from the set

closes. What does this tell you about the identity of

if

is true

if

is true

in and is Ø? 3 Show that a tree generated from the set

closes. What does this tell you about the identity of

in and is the entire domain of ? 4 Let D be the set {0, 1}. List all the members of D3, i.e. all the triples or 3-tuples of members of D.


2 FORMULAS AND TRUTH

In elementary mathematics we are familiar with open formulas being true for given values of their free variables and false for others. For example, ‘x

can even be defined in terms of the operator □ thus: A>B = □(A→B). An alternative way of developing modal propositional logic is to take such an operator > as primitive and define □, and hence ◊, in terms of it as follows: □A = T>A, where T is a tautology. Lewis proposed several inequivalent modal deductive systems, characterised, like H in Chapter 10, by sets of logical axioms and rules of inference and classified as S1–S5, but provided little if any independent semantic underpinning for them. The situation was changed dramatically in the fifties and sixties by Kripke, who provided a uniform semantic framework in terms of which it is possible to prove the completeness of various modal systems with respect to different interpretations of the modal operators within that framework. The heuristic motivation for this so-called Kripke semantics was Leibniz’s view that this world is merely one among a host of other possible worlds and that a statement is necessary if it is true in all possible worlds. Kripke’s innovation was to consider possible worlds as purely formal objects in the domain of a binary relation R called an accessibility relation, and he showed that by imposing stronger or weaker constraints on R, different classes of interpretation are obtained with respect to which familiar modal deductive systems can be shown to be sound and complete.

Beyond the fringe


We shall limit the discussion to modal propositional logic, since most of the interesting recent work done in modal logic is located there. Where R is an accessibility relation on a set W of worlds and L the modal propositional language generated from some set of sentence letters, the pair (W, R) is called a frame for L. An interpretation v in a frame F assigns a truth-value, for each world w in W, to every sentence X in L. First v assigns a truth-value to every sentence letter; write this as v(A, w)=T or F as the case may be. The truth-functional connectives are evaluated in the usual way: v(¬X, w)=T if and only if v(X, w)=F; v(X∨Y, w)=T if and only if v(X, w)=T or v(Y, w)=T, etc. For X=□Y, v(X, w)=T if and only if v(Y, w')=T for all w' such that R(w, w'); informally, X is true in w if and only if Y is true in every world w' accessible from w. An interpretation v in F is said to be a model in F of X if v(X, w)=T for all w in W. X is said to be valid in F if all interpretations in F are models in F of X. The sentences valid in all frames in which R is reflexive constitute the modal system T. The sentences valid in all interpretations in which R is transitive and reflexive are those of Lewis’s S4. The sentences valid in all interpretations in which R is reflexive, transitive and symmetric, i.e. an equivalence relation, are those of S5. Each of these systems has an equivalent syntactical characterisation: as H-style formalisations, these various sets of sentences are those derivable from suitably chosen logical axioms by means of two rules of inference, modus ponens and Necessitation: if A is derivable, so is □A. T’s logical axioms are all instances of □(A→B)→(□A→□B) and □A→A. S4’s are those of T plus □A→□□A. S5’s are those of S4 plus ◊A→□◊A.
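The evaluation clauses for interpretations in a frame lend themselves to a direct recursive sketch in Python. The encoding is an assumption made for illustration: sentence letters are strings, compound sentences are tagged tuples with the hypothetical tags "not", "and", "or", "imp", "box" and "dia", R is a set of ordered pairs of worlds, and validity in a frame is tested by brute force over all assignments to the listed sentence letters.

```python
from itertools import product

def holds(formula, w, R, val):
    """v(formula, w) = T? Here val maps (letter, world) pairs to booleans."""
    if isinstance(formula, str):
        return val[(formula, w)]
    tag = formula[0]
    if tag == "not":
        return not holds(formula[1], w, R, val)
    if tag == "and":
        return holds(formula[1], w, R, val) and holds(formula[2], w, R, val)
    if tag == "or":
        return holds(formula[1], w, R, val) or holds(formula[2], w, R, val)
    if tag == "imp":
        return (not holds(formula[1], w, R, val)) or holds(formula[2], w, R, val)
    if tag == "box":   # true at w iff true at every world accessible from w
        return all(holds(formula[1], v, R, val) for (u, v) in R if u == w)
    if tag == "dia":   # true at w iff true at some world accessible from w
        return any(holds(formula[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError(f"not a sentence: {formula!r}")

def valid_in_frame(formula, W, R, letters):
    """True iff every interpretation in the frame (W, R) is a model of formula."""
    worlds = sorted(W)
    keys = [(p, w) for p in letters for w in worlds]
    for bits in product([False, True], repeat=len(keys)):
        val = dict(zip(keys, bits))
        if not all(holds(formula, w, R, val) for w in worlds):
            return False
    return True

axiom_t = ("imp", ("box", "A"), "A")                   # box A -> A
axiom_4 = ("imp", ("box", "A"), ("box", ("box", "A")))  # box A -> box box A

W = {0, 1, 2}
identity = {(w, w) for w in W}
reflexive_only = identity | {(0, 1), (1, 2)}            # reflexive, not transitive
reflexive_transitive = reflexive_only | {(0, 2)}        # reflexive and transitive
not_reflexive = {(0, 1), (1, 2)}
```

On these three sample frames the reflexivity axiom is valid exactly where R is reflexive, and the S4 axiom fails on the reflexive but non-transitive frame, which is what the correspondence between constraints on R and modal systems leads one to expect.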
For further information about traditional modal logic the reader should consult a good introductory text, like that of Chellas 1980, or Hughes and Cresswell 1972 (who also discuss so-called quantified modal logic, the extension of the modal operators to predicate languages). Anyone wishing to keep logic metaphysics-free might cast a wary eye on Kripke’s semantics for modal logic. One response is that sets of possible worlds with weaker and stronger accessibility relations on them are probably best looked on as algebraical structures which furnish a useful mathematical tool for investigating the relation between different modal systems. If this were the end of the story modal logic might by now have ceased to be of much interest to logicians. In fact, interest in it has never been keener, inspired by the fact that various interesting formal deductive systems can be interpreted in an illuminating way in a suitable modal system and vice versa (an interpretation of one formal system in another is a translation-function f which maps sentences of the language of the first system into the sentences of that of the second, in such a way that the theorems of the first system are translated into sentences of the second). For example, Intuitionistic logic is interpretable in S4, the translation exploiting the fact that Kripke models for Intuitionistic logic are closely related to S4 frames. Now Intuitionistic logic is allegedly a logic of constructive provability, which suggests that one way of regarding necessity is as provability from principles themselves regarded as a priori necessary. 
A fruitful way of exploring this idea is suggested by the following facts (in what follows, ‘A’ will be the numeral in PA for the Gödel number of A): (i) the provability predicate for the first-order Peano Axioms is definable in the language L PA of first-order Peano Arithmetic by a formula Pr(x) (Chapter 11, section 5); (ii) first-order Peano Arithmetic seems to be a prime candidate for the status of an a priori necessary body of knowledge; (iii) if A is provable (from PA) so is Pr(‘A’), mimicking the modal rule of Necessitation; and (iv) if Pr(‘A→B’) and Pr(‘A’) are provable so is Pr(‘B’), mimicking the fact that if □(A→B) and □A are modal theorems so is □B. (i)–(iv) suggest that a substantial part of basic modal logic is interpretable in PA, with the formula Pr(x) in L PA interpreting the modal box. Some of the very deep results about the structure of provability obtained from the study of this interpretation are collected and clearly explained in Boolos (1993; this work also contains an excellent and very lucid account of the Gödel Incompleteness Theorems).

But there are theorems of all the standard modal systems that do not translate into theorems of PA, for there are sentences A such that Pr(‘A’)→A is not a theorem of PA. It is not difficult to show why this is so (the result is originally due to Montague (1963)). First, some background. A famous earlier result of Gödel (known as Gödel’s Diagonal Lemma) is that for any formula F(x) of L PA there is a sentence B such that F(‘B’)↔B is provable from PA (we take ↔ to be defined in the usual way). In particular, ¬Pr(‘G’)↔G is a consequence of PA where G is the ‘undecidable’ sentence of Chapter 11, section 5. A theorem of Löb (Boolos 1993:56) says that if Pr(‘A’)→A is a consequence of PA, for any sentence A, then so is A. Löb’s Theorem and Gödel’s First Incompleteness Theorem jointly imply that if PA is consistent then Pr(‘G’)→G is not provable from PA.

Suppose, however, we add that sentence and indeed all instances of Pr(‘A’)→A to PA, since for every sentence letter A, □A→A is a valid sentence in every traditional modal system and rightly so if necessity means anything familiar at all. Call the result of making all these additions the system M. M is now easily shown to be inconsistent. For by truth-functional logic ¬Pr(‘G’) is a consequence of Pr(‘G’)→G and ¬Pr(‘G’)↔G. So is G. Hence G is provable in M and so by (iii) above Pr(‘G’) is provable in M. Since ¬Pr(‘G’) is provable in M as well, M is inconsistent.
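The purely truth-functional step in this argument can be checked by a two-letter truth table. In the Python sketch below, p stands in for Pr(‘G’) and g for G; treating them as two independent sentence letters, and the helper name implies, are assumptions of the illustration. It lists every assignment that satisfies both Pr(‘G’)→G and ¬Pr(‘G’)↔G.

```python
from itertools import product

def implies(x, y):
    """Truth-functional conditional."""
    return (not x) or y

# p stands in for Pr('G') and g for G, treated here as independent atoms.
satisfying = [
    (p, g)
    for p, g in product([True, False], repeat=2)
    if implies(p, g) and ((not p) == g)
]
print(satisfying)  # prints [(False, True)]
```

The single surviving row makes p false and g true, which is the claim that ¬Pr(‘G’) and G both follow. The remaining step, from the provability of G to that of Pr(‘G’), depends on (iii) and is not truth-functional, so it lies outside what a truth table can show.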
This startling result shows that if necessity is a predicate of the sentences of a language L, it cannot be one which (a) is defined by a formula of L, (b) renders all the consequences of first-order Peano Arithmetic necessary truths and (c) satisfies all the most basic modal principles. It is tempting to regard this result as showing that assigning necessity the role of object-language operator, as traditional modal logic does, simply obscures the fact that it is essentially metalinguistic, like truth according to the Metalinguistic Theory. But this conclusion would be premature. Montague’s result does not demonstrate that there is not some sense or senses of necessity which can be expressed in a suitable object-language. Kripke’s semantics shows that there is. There are also more intuitive senses of necessity, like the sort of physical necessity which laws of nature are alleged to possess, in which it is not true that the laws of arithmetic, even if true, are necessary, and there have been explicitly modal accounts of physical necessity. There have even been modal interpretations of counterfactuals (Lowe 1983).

3 INDICATIVE CONDITIONALS AND →

It has been conceded that the truth-functional → does not in general provide a good interpretation of counterfactual conditionals (at the same time, however, it seems at least open to doubt whether any definite truth-claims are made by counterfactuals, except when they can be interpreted as the conditional predictions made by scientific laws). A more radical objection to the truth-functional → is that it does not adequately represent even the non-subjunctive, or indicative, conditionals used in ordinary speech. Those who hold this view employ a battery of informal examples which, they claim, are counterexamples to any formalisation using →. The following are a representative sample from the literature. I hope that by the end of their discussion it will be apparent that they are not counterexamples to →.

(i) Suppose A is the sentence ‘I add sugar to my coffee’, B is ‘It [my coffee] will taste sweet’ and C is ‘I add diesel to my coffee.’ Here we seem to have a counterexample to the inference ‘If A then B. Therefore, if (A and C) then B’, which is deductively valid if the conditional is represented by the truth-functional →. Hence, we are asked to conclude, → does not represent the ordinary English conditional.

Answer It is quite extraordinary that this could ever have been regarded as a serious objection, yet it certainly has been. At any rate, it is no counterexample, merely a failure to be explicit. ‘If I add sugar to my coffee it will taste sweet’ is accepted as true only because ‘and I add nothing else’ is tacitly added to the antecedent. What is really being asserted is a statement of the form ‘If A and D then B.’ The inference is therefore one of the form ‘If A and D then B. Therefore if A and not-D then B’, which is truth-functionally invalid (‘I add diesel’ we can represent, at least for the purposes of the discussion, as being the negation of D).

(ii) ‘If it rains then it won’t rain heavily. Therefore if it rains heavily then it won’t rain.’ The premise may be true, but the conclusion seems rather obviously false, contradicting the claim that all inferences of the form ‘If A then B. Therefore if not-B then not-A’ are deductively valid, as they are, of course, when formalised using the truth-functional arrow →.

Answer This is slightly, but only slightly, an advance on the previous example.
‘If it doesn’t rain then it won’t rain heavily’ is something we will presumably accept as a necessary truth (the discussion of adverbial constructions, in Chapter 5, shows that when formalised in first-order logic it is a logical truth). From this and the original premise, ‘If it rains then it won’t rain heavily’, we can infer unconditionally, by a step that seems valid enough (the Rule of Dilemma: ‘If A implies B and the negation of A also implies B, then B’), that it won’t rain heavily. From ‘It won’t rain heavily’ we infer ‘If it rains heavily then it won’t rain’ by Absurdity (Chapter 10, section 3) and →-introduction (Chapter 10, section 3). There are people who will reject the answer to (ii) because they reject the Rule of Absurdity. Yet Absurdity seems completely justified by the provisional definition of deductive validity in the first chapter: clearly, if a premise cannot be true then a fortiori it cannot be true and any conclusion false. However, the inference ‘Not-A, therefore if A then B’ which it supports has been questioned for the following reason: (iii) Suppose A is ‘A Democrat will be the next US President’, and B is ‘The next US President will permit racial segregation’ (this is similar to an example in Edgington 1991:180). The problem here is that it seems that we can rationally accept ‘Not-A’ as true (depending on the state of the opinion polls) and no less rationally reject ‘If A then B.’ However, a basic principle of probability theory asserts that the probability of the conclusion of a deductively valid inference is at least as great as that of the conjunction of


the premises (Howson and Urbach 1993:25). As we know, A→B is a truth-functional consequence of ¬A, so if we model the English conditional by →, then, if we accept ‘Not-A’ as more probable than not, we are bound to accept ‘If A then B’ as more probable than not. But, in this example, ‘If A then B’ seems almost certainly false, quite independently of the probability of ‘Not-A’, which in the appropriate circumstances might be regarded as probably true. Surely these judgments are not really inconsistent? But if they are not, then it follows that we cannot model the English conditional by →.

Answer We shall argue that while ‘If a Democrat will be the next US President then they will permit racial segregation’ seems obviously false, the grounds for judging that it is false do not on analysis support that conclusion. Further discussion will be postponed until after consideration of the next, related, example.

(iv) Here is a well-known but too-easy proof of God’s existence. The sole premise is ‘It isn’t true that if God exists then we are free to do as we like’, which seems to be true. But ‘God exists’ is a truth-functional consequence of that premise if the latter is formalised as a negated truth-functional conditional ¬(A→B). Yet nobody in their right mind would believe this inference to God’s existence to be really valid.

Answer to (iii) and (iv) The crucial issue in both these last two examples is whether the relevant factual data are properly expressed by asserting the falsity of an English conditional (or equivalently the truth of its negation): in the first example, of the conditional ‘If a Democrat will be the next US President then they will permit racial segregation’ and in the second, ‘If God exists then we are free to do as we like.’ Consider the second first.
We presumably feel that our available evidence, as presented in sacred literature and/or the theologians’ interpretation of it, indicates that God’s existing would be incompatible with our freedom to do as we please; i.e. in conjunction with that evidence, ‘God exists’ implies the falsity of ‘We are free to do as we please.’ But then what we have grounds for believing true is not the negated conditional ‘It is not the case that if God exists we are free to do as we please’, but the very different conditional ‘If God exists we are not free to do as we please’; only from the latter is there a clear implication that we are not free to do as we please if God really does exist. Similarly, in (iii), what our evidence directly supports is not the negated conditional ‘It is not the case that if a Democrat will be the next US President then the next US President will reintroduce racial segregation’, but the conditional ‘If a Democrat is the next US President then the next US President will not permit racial segregation.’ In other words, what the evidence directly supports in each case is a conditional of the form ‘If A then not-B.’ Nor do there seem to be any further grounds for asserting ‘It is not the case that if A then B.’ ‘It is not the case that if A then B’ certainly does not follow deductively from ‘If A then not-B’ when ‘If ... then ...’ is rendered by the truth-functional arrow, nor is there any compelling reason to think it should on any wider consideration. One reading which does make that inference valid is a Lewis-Stalnaker one, in which the conditional is parsed in the same way as either of their counterfactuals. However, even were the Lewis-Stalnaker approach acceptable (and there is one counterfactual that resists their treatment), which it is not if the earlier discussion of it is sound, the conditionals in these examples are not counterfactuals, nor does there seem any good reason why they should be treated as such.
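The truth-functional claims made in examples (i)–(iv) and their answers can be confirmed by brute-force truth tables. The Python sketch below is an illustration only: the helper names implies and entails are assumptions of the example, and each English conditional is rendered by the truth-functional →, which is exactly the reading under discussion.

```python
from itertools import product

def implies(x, y):
    """Truth-functional conditional."""
    return (not x) or y

def entails(premises, conclusion, n):
    """True if no assignment to n letters makes every premise true
    and the conclusion false."""
    for row in product([True, False], repeat=n):
        if all(p(*row) for p in premises) and not conclusion(*row):
            return False
    return True

# (i) 'If A then B. Therefore if (A and C) then B'
strengthening = entails([lambda a, b, c: implies(a, b)],
                        lambda a, b, c: implies(a and c, b), 3)
# (i) as re-read in the Answer: 'If A and D then B. Therefore if A and not-D then B'
reread = entails([lambda a, b, d: implies(a and d, b)],
                 lambda a, b, d: implies(a and not d, b), 3)
# (ii) 'If A then B. Therefore if not-B then not-A'
contraposition = entails([lambda a, b: implies(a, b)],
                         lambda a, b: implies(not b, not a), 2)
# (iii) 'Not-A, therefore if A then B'
from_not_a = entails([lambda a, b: not a],
                     lambda a, b: implies(a, b), 2)
# (iv) 'It is not the case that if A then B. Therefore A'
from_negated = entails([lambda a, b: not implies(a, b)],
                       lambda a, b: a, 2)
# 'If A then not-B. Therefore it is not the case that if A then B'
to_negated = entails([lambda a, b: implies(a, not b)],
                     lambda a, b: not implies(a, b), 2)

print(strengthening, reread, contraposition, from_not_a, from_negated, to_negated)
# prints True False True True True False
```

The two False results are the ones the answers rely on: the re-read form of (i) is truth-functionally invalid, and ‘If A then not-B’ does not truth-functionally yield ‘It is not the case that if A then B’ (any assignment making A false is a counterexample).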
It might be objected that in ordinary English we readily make the inference from ‘If A


then not-B’ to ‘It is not the case that if A then B.’ The objection to accepting this, even if it were true, which is doubtful, is not only that there appears to be no justification for such an inference, but that to adopt it as a rule would lead to incoherence, as the following example shows. Let A=B=¬(0=0). ‘If ¬(0=0) then 0=0’ seems to be something which, when thought through, is acceptable to anybody who accepts the ordinary theory of identity and the Absurdity Rule. If they also accept contraposition then they should accept ‘If ¬(0=0) then ¬¬(0 = 0)’, i.e. ‘If A then ¬B.’ But they would almost certainly not accept as true ‘It is not the case that if A then B’, since that is the same sentence as ‘It is not the case that if A then A.’ The lesson from this is, I believe, that what we accept and reject in the way of rules should not be based on consideration of an isolated example, but should instead be a decision constrained by more global considerations of how well it contributes to an overall consistent and acceptable theory. But this is to acknowledge the authority of methodological criteria and in particular two widely accepted as constraints on scientific theorising of any sort, generality and coherence. The earlier chapters have shown that many intuitively valid inferences involving conditionals can be successfully analysed using the purely truth-functional →. In the light of this and the previous discussion, it may well be (I actually believe it to be) that the truth-functional conditional is overall the best coherent model of inferences involving indicative conditionals and that where our intuitions conflict with its deliverances those intuitions may simply not offer the best, or even good, guidance. There is nothing bizarre about this suggestion: intuitions are frequently, if reluctantly, ignored in the face of a coherent theory which says that they are wrong. 
Intuitions about probability, for example, can be notoriously at odds with the theory of probability (a wealth of empirical studies shows just how divergent intuitions are from the theory), yet the broad consensus is that the latter is the sounder judge of what is correct. I suggest, tentatively, that the same is true in the present case and that the advantages which flow from the truth-functional account will eventually be seen to outweigh contrary intuitions. It should be said that the foregoing discussion of the relation between ordinary-language conditionals and the truth-functional → is heavily coloured by the author’s own views (though these have a certain amount in common with the theory of assertibility conditions for conditionals due to Grice and Jackson (Jackson 1991)). There has been much debate of this topic and the reader is strongly encouraged to examine alternative accounts. Fortunately, there are two excellent anthologies, by Harper, Stalnaker and Pearce (1981) and by Jackson (1991), as well as a recent survey article by Edgington (1995). Sainsbury (1991) contains a thorough and clear discussion, as does Read (1994).

4 CONCLUSION

First-order logic has, I believe, a good claim to be regarded as the formal model for an invariant and substantial core of informal deductive reasoning. Its fit is not everywhere perfect, but considering the largely unregulated manner in which human language and reasoning have developed and the variety of purposes to which they are put, a perfect fit is hardly to be expected.


Almost every year, however, brings further claims that first-order logic is ‘dead’. One of the most recent is Devlin’s (1991), as a preamble to presenting his own theory of logic as a sort of very general theory of information-processing. One of the alleged deficiencies is the familiar problem of conditionals and I have argued that on inspection this does not issue in a general condemnation of the truth-functional →. Devlin’s second objection is more fundamental: it is to show that the whole idea of necessary truth-preservation from premises to conclusion is misconceived. His example is the inference (*) Jon walked into the restaurant. He saw that the waitress had dirty hands. So Jon left immediately. This, he claims, would be declared valid by any ordinary person untutored in classical logic, because it is an example of a type of reasoning that ordinary people habitually engage in. Clearly, it is not valid in first-order logic. Indeed, it is not deductively valid according to a much less formal criterion. Nobody who thought for more than a few seconds would say it was a deduction, for the simple and obvious reason that it is possible for Jon to enter the restaurant and see what he saw and yet not leave. He might know that the restaurant possessed an outstanding chef and regard the state of the waitress’s hands as a price worth paying for a superior cuisine. He might have an assignation with the waitress. He might not even like cleanliness. The possibilities consistent with his not leaving are endless. So Devlin’s objection amounts to saying that there are types of reasoning that are not deductive. But we know this; we have already mentioned inductive reasoning, which is the reasoning that we perform when we predict what will happen on the basis of evidence that seems to make such predictions highly likely but not certain. 
And (*) is a typical example of such reasoning: it is probable reasoning, as the eighteenth-century British empiricist philosophers classified it. But to admit that there is probable, or inductive, reasoning is no reason to deny that there is also deductive, demonstrative reasoning. Which, of course, there is. Nor is there any need to construct a theory of general reasoning which blurs the distinction between those two types. As we have suggested above, there is already a general theory of inductive and deductive reasoning which nevertheless maintains the distinction. Indeed, expounding first-order logic in isolation from the theory of probability is really only telling half the story, for the model of deductive reasoning provided by first-order logic interlocks with probability theory to provide a general account of both inductive and deductive reasoning. Probabilities are a natural and indeed indispensable tool in the theory of non-deductive inference, where evidence supports to a greater or lesser extent some explanatory theory, but does not entail its truth (Howson and Urbach (1993) is an introductory text for a well-known probabilistic account, called the Bayesian theory of uncertain inference). As we have seen in discussion of these examples, probabilistic considerations can be relevant in discussing the adequacy or otherwise of a deductive model and the dovetailing of the theory of probability with that model lends support to both—unfortunately, to an extent we can merely catch glimpses of in this book.
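One point at which the two theories interlock is the principle cited in example (iii) of the previous section: the conclusion of a deductively valid inference is at least as probable as the conjunction of its premises. The Python sketch below illustrates it for the inference from ¬A to A→B. The four probabilities are invented for the illustration and carry no empirical weight; the function name prob is likewise an assumption of the example.

```python
from fractions import Fraction

# Hypothetical probabilities for the four truth-value combinations of (A, B);
# the numbers are invented for illustration and sum to 1.
weights = {
    (True, True): Fraction(1, 100),
    (True, False): Fraction(29, 100),
    (False, True): Fraction(10, 100),
    (False, False): Fraction(60, 100),
}

def prob(event):
    """Probability of the combinations (a, b) for which event(a, b) is true."""
    return sum(w for (a, b), w in weights.items() if event(a, b))

p_not_a = prob(lambda a, b: not a)
p_arrow = prob(lambda a, b: (not a) or b)
print(p_not_a, p_arrow)  # prints 7/10 71/100
```

Whatever weights are chosen, the second figure cannot fall below the first, because every combination that makes ¬A true also makes A→B true.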

Notation

[List of notation, arranged by chapter: Chapters 1–7 and 9–12]

Answers to selected exercises

Chapter 1 Section 4

1 False.

2 False.

No.

Section 5

True.

Section 6

1

Section 7

2

Section 8

1 If you want the answer to be ‘yes’ if and only if B is true, the question is ‘Is ¬ (A↔B) true?’

B→A

2 ↔ 3 (i) (ii)

4 ∧

5 A∧B; A=Untidy work will cost you marks, B=Inaccurate work will cost you marks.

6 No.

7 (A∧B)→(¬C→D)
A=You are over 18. B=You are married. C=You have already received benefit. D=Your name will go on the register.

Chapter 2 Section 2

1

Section 3

1

2 (i) The premise is redundant; (ii) the premises are unsatisfiable.

3

Section 4

(i) Tautology.
(ii) Tautology.
(iii) Contradiction.
(iv) Neither.
(v) Tautology.
(vi) Neither.
(vii) Neither.
(viii) Contradiction.
(ix) Contradiction.

Chapter 3 Section 1

1 (i) Two: A, B.
(ii) Infinitely many.
(iii) Infinitely many.

2 No. No, it does not mean that there is no biconditional sentence in L[A, B; ∧, ∨, ¬, →], since there are sentences in L truth-functionally equivalent to A↔B.

3 ¬A∨B⇔A→B; ¬(A∧A)∨B⇔A→B; ¬((A∧A)∧A)∨B⇔A→B; etc.

Section 3

(a)


(b)

(c)

Section 4

Section 6

1 (a) A∧B, B∧A
(b) A∧B, ¬B
(c) None
(d) A, B→C
(e) A→(B→C)

1 2k

2 (i) (A∧B)∨(¬A∧B)∨(¬A∧¬B)
(ii) (A∧B)∨(¬A∧B)∨(A∧¬B)
(iii) (A∧B)
(iv) (A∧B)∨(¬A∧¬B)
(v) (A∧B)∨(¬A∧B)∨(A∧¬B)∨(¬A∧¬B)
(vi) A∧B∧C
(vii) ¬A∧¬B

1

Section 7

2 The shortest are (A|A)|(B|B); (A↓A)↓(B↓B).

3 The shortest X is A|(B|B). The shortest Y is ((A↓A)↓B)↓((A↓A)↓B).

4 (i) Neither.
(ii) Tautology.
(iii) Contradiction.
(iv) Neither.

5 (i) A∨B
(ii) (¬A∧B∧C)∨(¬A∧¬B∧C)∨(¬A∧B∧¬C)

6 Because | and ↓ are binary connectives and (A|B)|C is not truth-functionally equivalent to A|(B|C), and (A↓B)↓C is not truth-functionally equivalent to A↓(B↓C).

Section 9*

1 ¬A∨B.
2 (¬A∨B)∧(A∨¬B)∧(A∨B).
3 ¬A∨B∨C.
4 A∨¬A.
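Answers of the Section 7 kind, stated in terms of | and ↓, can be checked by comparing truth tables. The Python sketch below is an illustration only; the function names nand, nor and equivalent are assumptions of the example, and since the exercise statements are not reproduced here, the right-hand sides of the comparisons are simply the familiar sentences that the answers turn out to match.

```python
from itertools import product

def nand(x, y):
    """The stroke |."""
    return not (x and y)

def nor(x, y):
    """The dagger ↓."""
    return not (x or y)

def equivalent(f, g, n):
    """True if f and g agree on every assignment to n letters."""
    return all(f(*row) == g(*row) for row in product([True, False], repeat=n))

# Answer 2: (A|A)|(B|B) matches A∨B, and (A↓A)↓(B↓B) matches A∧B.
a2_stroke = equivalent(lambda a, b: nand(nand(a, a), nand(b, b)),
                       lambda a, b: a or b, 2)
a2_dagger = equivalent(lambda a, b: nor(nor(a, a), nor(b, b)),
                       lambda a, b: a and b, 2)
# Answer 3: A|(B|B) and ((A↓A)↓B)↓((A↓A)↓B) both match A→B.
a3_stroke = equivalent(lambda a, b: nand(a, nand(b, b)),
                       lambda a, b: (not a) or b, 2)
a3_dagger = equivalent(lambda a, b: nor(nor(nor(a, a), b), nor(nor(a, a), b)),
                       lambda a, b: (not a) or b, 2)
# Answer 6: neither connective is associative.
stroke_assoc = equivalent(lambda a, b, c: nand(nand(a, b), c),
                          lambda a, b, c: nand(a, nand(b, c)), 3)
dagger_assoc = equivalent(lambda a, b, c: nor(nor(a, b), c),
                          lambda a, b, c: nor(a, nor(b, c)), 3)

print(a2_stroke, a2_dagger, a3_stroke, a3_dagger, stroke_assoc, dagger_assoc)
# prints True True True True False False
```

A check of this kind confirms equivalence but says nothing about whether a sentence is the shortest of its kind.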

Chapter 4 Section 2

1 (i) β
(ii) Neither.
(iii) β
(iv) Literal.
(v) α
(vi) α

2 (i) Closed.
(ii) Open.
(iii) Closed.
(iv) Open.

Section 3


Hence truth-functionally valid.

(b) Truth-functionally invalid. Two counterexamples:
A-F, B-T, C-T, D-T
A-F, B-F, C-T, D-T

(c)

4 (a)

A B C
T T T
T F T
F T T
T F F
F F T
F T F
F F F

Chapter 5 Section 2

1 (i) Everyone is tall.
(ii) Someone is broad.
(iii) Someone is tall and broad.
(iv) Someone is tall and everyone is broad.

2 No.

Section 3

1 (i) True: it says that there is a pair of positive integers one of which is less than or equal to the other.
(ii) True: it makes the same statement as (i).
(iii) True: it says that every positive integer is less than or equal to itself.
(iv) True: it says that for every positive integer there is one at least as large.
(v) True: it says that for every positive integer there is one no greater than it.
(vi) True: it says that there is a positive integer less than or equal to every positive integer.

2 Only (i) and (ii) remain true; the remainder are false.

Section 4

2 (i) (ii) (iii) (iv) (v) (vi) (vii) (viii) (ix) (x) (xi)

Note: There are sentences equivalent but not identical to each of (i)–(x), and if your answer to any of these is not as given above you should try to see whether it is equivalent to it. 3 (i) (ii) (iii)

S(a, b)


Chapter 6 Section 1

1

2 (a) P(x) (b) P(x)∧Q(x) (c) (d) (e)

3 (i) (a) P(a) (b) P(a)∧Q(a) (c) (d) (e)

6 Closed.

Section 2

1 (a) δ (b) δ (c) Neither. (d) Neither. (e) δ (f) Neither. (g) Neither.

2 (a) (b) ¬(Q(a)→R(a, a)) (c) ¬(Q(a)∧P(a))

3 (δ) is misapplied at lines 5 and 6: b cannot be used to instantiate line 5, since it already appears at line 4.

4 One such domain is N, the set of natural numbers, with R(x, y) interpreted as x

The London School of Economics and Political Science/ Routledge Books published under the joint imprint of LSE/Routledge are works of high academic merit approved by the Publications Committee of the London School of Economics and Political Science. These publications are drawn from the wide range of academic studies in the social sciences for which the LSE has an international reputation.

Logic with trees An introduction to symbolic logic

Colin Howson

London and New York

First published 1997 by Routledge
11 New Fetter Lane, London EC4P 4EE

This edition published in the Taylor & Francis e-Library, 2005.

“To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.”

Simultaneously published in the USA and Canada by Routledge
29 West 35th Street, New York, NY 10001

© 1997 Colin Howson

All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloguing in Publication Data
Howson, Colin.
Logic with trees: an introduction to symbolic logic/Colin Howson.
p. cm.
1. Logic, Symbolic and mathematical. I. Title.
BC135.H68 1996 96–7315
160–dc20

ISBN 0-203-97673-8 Master e-book ISBN

ISBN - (Adobe e-Reader Format) ISBN 0-415-13341-6 (hbk) 0-415-13342-4 (pbk)

To Minou, who would have enjoyed sitting on this book

Contents

Acknowledgments
Introduction

Part I Truth-functional logic
1 The basics
2 Truth trees
3 Propositional languages
4 Soundness and completeness

Part II First-order logic
5 Introduction
6 First-order languages: syntax and two more tree rules
7 First-order languages: semantics
8 Soundness and completeness
9 Identity
10 Alternative deductive systems for first-order logic
11 First-order theories
12 Beyond the fringe

List of notation
Answers to selected exercises
References
Name index
Subject index

Acknowledgments

I should like to express my gratitude to Tony Dale, Gustavo Fernandez and Tony Ungar for their detailed comments on earlier versions of this book. Other people who have offered very helpful advice and discussion are Rose Gibson, R.I.G. Hughes, Peter Milne, Margaret Morrison, Jan von Plato, Demetris Portides, Adam Rieger, Aldo Visintin, John Worrall, Elie Zahar, and many undergraduates and postgraduates of the London School of Economics. I should like to express my thanks also to Theresa Hunt, Pat Gardner, Cynthia Ma and Towfic Shomar for their assistance in preparing the typescript.

Introduction

Logic was one of the first scientific disciplines to be identified and studied systematically. For various reasons, which historians of ideas still disagree about, the Stoic and Aristotelian beginnings were left undeveloped, and no real advances were made until over two thousand years later. Then, in the second half of the nineteenth century, a succession of mathematicians took up the subject, and as a result of their attentions it grew rapidly into a discipline of great power; it has generated results which have transformed the way we think, at a quite fundamental level. In particular, it has given us information about the limitations of theorising that could hardly have been imagined, even if the questions could have been formulated, only a century ago. Some of these results have been interpreted in extraordinary ways. In Douglas Hofstadter’s best-seller Gödel, Escher, Bach (1979), two celebrated theorems of Gödel have been compared to the works of Bach. Elsewhere, they have convinced some people that we are more than machines, and others that we are no more than machines.

On a more practical level, however, logic is now acknowledged to be of central importance, particularly in computer science and artificial intelligence, and anyone who wants to work in the area of software development will have to have an increasingly considerable degree of acquaintance with it. ‘Logic’, as the foreword to one of a rapidly increasing number of recent texts on logic oriented towards applications in computing attests, is ‘the calculus of computer science’¹. Indeed, because of its central role there, logic is now playing a similar enabling role in the information technology revolution to that which mathematics played in the scientific revolution of the seventeenth and eighteenth centuries.

Logic texts exemplify a variety of proof systems.
The one used in this book is the increasingly popular, arguably the most user-friendly, and the most obviously machine-implementable system, based on the semantic tableau method pioneered by the Dutch logician Evert Beth. It has been developed and simplified since, receiving its classic exposition in Raymond Smullyan (1968). Recently Richard Jeffrey has used a simplified form of it in his marvellous Formal Logic: Its Scope and Limits (1994). The present book is much more elementary than Smullyan’s, while I have attempted to introduce more standard material about first-order logic than Jeffrey does (including, in Chapter 3, an exposure to the crucially important role in many metatheorems played by induction), and less of the theory of computation. I have also added some ‘philosophical’ discussions, of truth and the Liar Paradox, categoricity, second-order logic, modal logic and conditionals.

This book can be used in various ways. Chapters 1, 2, 4 (the unstarred sections), 5 and 6 are material for an introductory course, while 1–9 would make up a comprehensive one-year course in first-order logic, and are suitable for students at both undergraduate and postgraduate level with no mathematical background who want to be able to understand the mesh of syntactic and semantic arguments that makes up modern formal logic. Chapter 11 attempts to give some idea of the depth and significance of the classic results of modern (meta)logic, while Chapter 12 outlines some of the ways in which first-order logic has been extended, and some of the principal objections brought against the representation of conditionals in first-order logic. These chapters could be used as a supplementary text in philosophy of logic, with the earlier material used as a source of reference for the main technical results of first-order logic. Proofs of soundness and completeness theorems for truth-functional and full first-order truth trees are given in Chapters 4 and 8. In Chapter 10 there is a discussion of examples of two of the main alternative proof systems, Hilbert-style and Natural Deduction. Various sections of the first eight chapters are starred, to indicate that the material is not so elementary there, and starred exercises indicate a greater level of difficulty.

I have tried to fulfil three principal aims: (i) to give a complete and clear account of the truth tree system for first-order logic, and of the important metatheorems associated with it; (ii) to show why logic is an exciting and flourishing discipline; and (iii) to show that the sorts of formal techniques exploited in proving even some ‘deep’ metalogical results are within the grasp of even determinedly non-mathematical students; for example, the various soundness and completeness proofs are not intrinsically difficult, and are certainly within the capacity of the non-specialist in logic to work through and understand. There are frequent failures in the book to achieve, and sometimes to approach, the highest standards of rigour, which I hope can be pardoned as sacrifices to clarity. There is no shortage of very rigorous texts for those who want them. One particular lapse is a more or less systematic failure to make explicit the ‘use-mention’ distinction.
Labouring that distinction with typographical devices of one sort or another both disfigures a text and makes it difficult to read. Usually context suffices to distinguish use from mention, and there are warnings where I believe that there is any danger of conflation. Those who believe that departure from the use-mention orthodoxy approaches mortal sin may be induced to take a more lenient view by noting the inclusion of a discussion of and running references to the object-language-metalanguage distinction, and a separate discussion of the Liar Paradox.

Note

1 Foreword to Garton 1990.

Part I Truth-functional logic

Chapter 1 The basics

1 DEDUCTIVELY VALID INFERENCE

There is much more to logic than the question of what makes inferences deductively valid or invalid, but to most people that is what logic is all about, so that is where we shall begin. One of the most basic features of these inferences is that they seem to be composed of declarative sentences, that is to say sentences which make assertions capable of being true or false. ‘Boris Yeltsin became President of Russia in 1993’, ‘All hydrogen atoms have one proton in their nucleus’ and ‘Michelangelo painted the ceiling of the Sistine Chapel’ are declarative sentences, and (we believe) true ones at that. ‘Shut that door!’ and ‘Is Hanoi in Scotland?’ are not. Neither is Chomsky’s funny example ‘Colourless green ideas sleep furiously’, which has the grammatical form, but only the form, of a fact-stating sentence. So far, so good.

The sentences composing an inference are its premises and conclusion, the latter usually signalled by the prefix ‘therefore’ (for which, for brevity, we shall often use the symbol ∴). If the inference is deductively valid the conclusion is called a deductive or logical consequence, or simply consequence, of the premises. What else do we know? Well, one of the most familiar facts about deductively valid inferences, and the one which probably goes farthest towards explaining the importance they have always been accorded, is that it is impossible for their conclusions to be false if their premises are true: if anything is basic to the notion of deduction, that surely is. Consider this example, known as a disjunctive syllogism:

It’s raining or it’s snowing.
It’s not raining.
∴ It’s snowing.

Clearly, you don’t have to know whether it’s actually raining or not, or snowing or not, to know that if the premises are true, so too is the conclusion. Not only is the conclusion true if the premises are. The conclusion must be true if the premises are true; there is no possibility of its being false.
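The claim about the disjunctive syllogism can be checked by running through the four combinations of truth-values for its two component sentences. The Python sketch below is an illustration only; it reads ‘or’ and ‘not’ truth-functionally, which is an assumption made here for the sake of the example, and the variable names are likewise invented.

```python
from itertools import product

# r: 'It's raining', s: 'It's snowing'. Reading 'or' and 'not'
# truth-functionally is an assumption of this sketch.
counterexamples = [
    (r, s)
    for r, s in product([True, False], repeat=2)
    if (r or s) and (not r) and not s  # premises true, conclusion false
]
print(counterexamples)  # prints []
```

The empty list records that no combination makes both premises true and the conclusion false.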
Not only is this the most important property of deductively valid inferences; it is difficult to think of any other that has that same generality. This being so, we might as well take it as the defining property, and accordingly frame the following Provisional definition: a valid deductive inference is one whose premises cannot all be true and conclusion false. The definition is provisional because the word ‘cannot’ itself rather obviously needs a
definition, and providing an adequate one is not trivial: most of this book will be occupied in the task. But one thing we do know is that ‘cannot’ here has nothing at all to do with empirical fact, as it does in the statement that water cannot unaided run uphill. ‘Cannot’ in this context refers to a logical impossibility. It is a logical, not merely a physical, impossibility that ‘It’s snowing’ is false if both ‘It’s either raining or it’s snowing’ and ‘It’s not raining’ are true (assuming sameness of spatio-temporal reference in premises and conclusion). Here are two more examples to consider. If Lev is in Moscow then Irina is in Kiev. Lev is in Moscow. ∴ Irina is in Kiev. Cain was hairy and Abel was his victim. ∴ Cain was hairy. It is intuitively clear that these remain deductively valid, in the sense of the definition above, whatever sentences are substituted for ‘Cain was hairy’, ‘Abel was his victim’, ‘Lev is in Moscow’, ‘Irina is in Kiev’ and, in the disjunctive syllogism, ‘It’s raining’ and ‘it’s snowing’. Another way of putting it is to say that if we replace these sentences by the letters A, B, C, D, E and F, the respective formal representations (or formalisations) of these inferences E or F not E ∴F If A then B A ∴B C and D ∴C will always generate deductively valid inferences when the letters A, B, C, D, E and F are replaced by any sentences. An explanation of why this is so will plausibly rest on an analysis of the logical role played by the particles ‘and’, ‘or’, ‘not’, and ‘if… then—’. Now, a common method of analysing some phenomenon is to construct a model of it and see whether the model behaves in a way sufficiently resembling what it is supposed to model. This will be our procedure. The model, which will be presented in a systematic form in Chapter 3, is called a propositional language. ‘And’, ‘or’, ‘not’, etc. 
are basic syntactical items of these languages, and in the following sections we shall describe the way they are used to form compound truth-functional sentences, and the rules which determine how truth and falsity should be ascribed to these. (The syntax of a language is the set of rules which determine its formal structure, that is to say the way its basic vocabulary is organised into well-formed expressions, among which are the sentences of the language; the rules by which the sentences are equipped with truth-conditions constitute the language’s semantics.)

2 SYNTAX: CONNECTIVES AND THE PRINCIPLE OF COMPOSITION ‘And’, ‘or’, ‘if…then—’ are structural items, called connectives by logicians, which articulate sentences into further sentences. ‘Cain was hairy’ and ‘Abel was his victim’ are said to be conjoined by ‘and’ to yield the conjunction ‘Cain was hairy and Abel was his victim’; the two sentences forming the conjunction are its conjuncts. ‘Not’ operates on the sentence ‘It’s raining’ to generate its negation ‘It’s not raining.’ ‘It’s raining’ and ‘it’s snowing’ are disjoined by ‘or’ to form the disjunction ‘It’s raining or it’s snowing’ of those two sentences, which are called the disjuncts. The sentences ‘Lev is in Moscow’ and ‘Irina is in Kiev’ are combined into the conditional sentence ‘If Lev is in Moscow then Irina is in Kiev’; ‘Lev is in Moscow’ is the antecedent, and ‘Irina is in Kiev’ is the consequent. These connectives play such a fundamental role that they have been given special symbols by logicians. The following are now standard:

Connective    Symbol
not           ¬
and           ∧
or            ∨
if…then       →

Because they operate on pairs of sentences to generate other sentences, ∧, ∨ and → are binary connectives; ¬ is unary, because it operates on single sentences. In what follows we shall refer, as just now, to ∧, ∨, and → directly as connectives rather than as connective symbols. In our model the basic items out of which its sentences are built are these connectives and a stock of capital letters A, B, C, D, etc., called sentence letters, from the beginning of the Roman alphabet. These are intended to represent some given set of English sentences with whose internal structure we are not concerned. The sentence letters are often called the atomic sentences of the model, because all its other sentences are compounded from them, using the connectives. The first level of composition consists of the negations, disjunctions, conjunctions of sentence letters, and conditionals formed from them. Each of these compounds is represented as follows: the negation of A by ¬A (¬ is prefixed to A, in contrast to the way ‘not’ is ordinarily embedded within a sentence to form its negation); the conjunction of A and B by A∧B; the disjunction of A and B by A∨B; and the conditional with antecedent A and consequent B by A→B. In a natural language there is no theoretical limit, though there are obviously practical ones, to the extent that sentences can be successively compounded together by means of connectives. Such a principle of composition will also operate in our model, to allow the sentences to be compounded ad infinitum using ¬, ∧, ∨ and →, generating symbol-strings like A→¬B, ¬(A→B), ((A→B)∨(B∧A))→C, etc. The brackets indicate which component

sentences in each compound the various connectives connect. Denoting arbitrary sentences in the model by the letters X, Y, Z, etc., we can give a compact statement of the principle of composition in which the bracketing is automatically taken care of. The statement has two clauses, one unconditional, the other conditional: A, B, C, etc. are sentences, and if X and Y are sentences, then so also are ¬X, ¬Y, (X∨Y), (X∧Y), (X→Y) (in informal discussion the outer brackets will generally be dropped). In Chapter 3 we shall see that these two clauses determine for each sentence in the model a unique ancestral tree. 3 SEMANTICS: TRUTH-FUNCTIONALITY There are two other important elements in our model, the truth-values ‘true’ and ‘false’, which will be represented by the letters T and F. We shall make the important assumption that the truth-values of compound sentences depend on the truth-values of their component sentences; in particular, it will be assumed that the truth-values of ¬X, X∧Y, X∨Y and X→Y depend on those of X and Y and only on those. Call this assumption that of the truth-functionality of the connectives. Mostly, this assumption works quite well, though there are apparent exceptions which we shall investigate at length later in the book. It has the following consequence, on which the whole of truth-functional logic is based. Consider X∧Y. Its truth-value depends on those of X and Y; the truth-value of each of these, if it is not a sentence letter, depends on those of the sentences out of which it is immediately compounded; and so on backwards in the same way until we arrive at the sentence letters which are not compounded out of anything. In other words, the truth-value of any compound X in the model depends only on the truth-values of the sentence letters appearing in it. This consequence will be called the truth table principle, for the following reason. Let X be any compound. 
Suppose we arrange the finitely many sentence letters which appear in X, say A, B, C, etc., in a row, and write beneath them, one per row, all the possible distributions of truth-values to these letters. We then write X to the right of A, B, C, etc., giving a diagram that looks like this:

A   B   C   …   |   X
T   T   T   …   |
T   T   F   …   |
…   …   …       |
F   F   F   …   |

The truth-functionality assumption implies, as we have just seen, that each row of truth-values on the left of the diagram determines a unique truth-value for X, which we shall write beneath X opposite the relevant row of truth-values on the left. The resulting table is called the truth table for X. The assignment of truth-values to sentence letters themselves is what model-builders call exogenous, determined outside the model. Our concern here is only with how those
truth-values, whatever they might be, determine the column of Ts and Fs beneath X in its truth table. The truth table principle tells us that this problem is solved once we have determined the truth tables for ¬ A, A∧B, A∨B, A→B. In the next section we shall make a start by determining the truth tables for ¬A and A∧B. We end this section on a philosophical note. There is a long-standing debate about whether sentences are truly bearers of truth-values, or whether only propositions can be (the usual definition of a proposition is that it is what is expressed by a sentence). While there is considerable disagreement, however, about exactly what sorts of things propositions actually are, there is absolutely no doubt that in everyday life the bearers of truth-values are the sorts of structured linguistic items described above, variously called statements or sentences; logicians tend to call them sentences. At any rate, it is with modelling these things that logicians have concerned themselves. And so shall we. 4 NEGATION AND CONJUNCTION If someone says to you that it is not the case that so and so, and you take what they say to be true, then this means that you take ‘so and so’ to be false. And if you take what they say to be false, this is because you take ‘so and so’ to be true. Putting these observations together in the model, we represent ‘so and so’ by the sentence letter A and construct the following truth table for ¬A:
A    ¬A
T    F
F    T

In words: ¬A is true when A is false, and false when A is true. Similarly, the truth table for A∧B, i.e. ‘A and B’, is obtained by the same method of identifying the conditions under which we believe a conjunction to be true and those under which we believe it to be false. If you agree that A∧B is true, you are agreeing that A and B are both true, while if you think that A∧B is false, it is because you think that at least one of A and B is false. This immediately gives the truth table for ∧:
A    B    A∧B
T    T    T
T    F    F
F    T    F
F    F    F

In words: A∧B is true just in case A and B are both true; otherwise it is false.
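The principle of composition and the truth tables for ¬ and ∧ can be rendered quite directly as a small program. The following Python sketch is offered only as an illustration and forms no part of the model itself: the names Letter, Not, And and value are labels invented for this example, and a row of the truth table is assumed to be given as a dictionary from sentence letters to True (T) or False (F).

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Letter:
    """A sentence letter such as A or B (an atomic sentence)."""
    name: str


@dataclass(frozen=True)
class Not:
    """The negation ¬X of a sentence X."""
    body: "Sentence"


@dataclass(frozen=True)
class And:
    """The conjunction (X∧Y) of sentences X and Y."""
    left: "Sentence"
    right: "Sentence"


Sentence = Union[Letter, Not, And]


def value(x: Sentence, row: Dict[str, bool]) -> bool:
    """Truth-value of x for one distribution of truth-values (one row)."""
    if isinstance(x, Letter):
        return row[x.name]                       # exogenous: read off the row
    if isinstance(x, Not):
        return not value(x.body, row)            # ¬X is T exactly when X is F
    if isinstance(x, And):
        return value(x.left, row) and value(x.right, row)  # T only if both are T
    raise TypeError(f"not a sentence of the model: {x!r}")


A, B = Letter("A"), Letter("B")
compound = Not(And(A, Not(B)))                   # ¬(A∧¬B)
result = value(compound, {"A": True, "B": False})
print(result)
```

Because value calls itself on the immediate components of a compound, the truth-value it returns depends only on the truth-values assigned to the sentence letters, which is just what the truth table principle asserts.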

Important note The truth-functionality assumption says that the truth-value of a conjunction or negation depends only on the truth-values of the sentences conjoined or negated, not on those sentences themselves. This means that, though the truth tables above are written for sentence letters A and B, they are equally valid when A and B are replaced by X and Y, i.e. by arbitrary sentences of the model language, compound or atomic. We could make this explicit by writing, as some do, the truth tables for negation and conjunction like this:
X    ¬X
T    F
F    T

X    Y    X∧Y
T    T    T
T    F    F
F    T    F
F    F    F

However, we shall continue to use the earlier tables, because their format is straightforwardly extended to the evaluation of any compound sentence, however complex. Exercises 1 If ¬A is true, what is the truth-value of A? 2 If A∧B is false and B is true, what is the truth-value of A? If you know merely that A∧B is false, does that tell you anything about the truth-value of A?

5 DISJUNCTION It is often claimed that there are two types of disjunction, or use of the word ‘or’, in English and other natural languages, inclusive and exclusive. To assert an exclusive disjunction (i.e. to claim implicitly that it is true) is to assert that one or other disjunct is true, but not both, while to assert an inclusive disjunction is to assert that at least one is true, and possibly both. There is certainly an inclusive use of ‘or’ in English; examples abound (here is one: ‘If you’re old or disabled nobody bothers with you’; we would all take the ‘old or disabled’ here to include any who are both old and disabled). By contrast, it is actually quite difficult to find a genuine use of exclusive ‘or’ which is not exclusive simply because the disjuncts are themselves exclusive, for example ‘He got either ten or twenty years; I can’t remember which.’ At any rate, logicians regard the inclusive ‘or’ as primary, and ∨ is accordingly given the truth table
A    B    A∨B
T    T    T
T    F    T
F    T    T
F    F    F

In words:

A∨B is false only when A and B are both false, and otherwise true. Nothing is lost in apparently ignoring the exclusive disjunction, because as we shall see in the next section, it is already implicit in the connectives ¬, ∧ and ∨. Warning Words are notoriously not always what they seem. Consider the statement ‘You may have tea or you may have coffee’, which is not, as it appears to be, a disjunction but a conjunction: it actually says that you may have tea and you may have coffee (though it does not mean that you may have both). Exercises If A∨B is true and A is false, what is the truth-value of B?

6 TRUTH-FUNCTIONAL EQUIVALENCE We can use the truth table principle to evaluate arbitrary truth-functional compounds built up from sentence letters using connectives from the list ¬, ∧, ∨ and →. Consider, for example, the compound (A∨B)∧¬(A∧B). We first evaluate the inner conjunction A∧B against each row of the truth table, then the negation ¬(A∧B), and finally the conjunction (A∨B)∧¬(A∧B), as below. The truth-values of this final conjunction are listed in bold type in the central column of the truth table:

Not a very exciting compound, one might think. However, suppose we introduce a new binary connective xor (exclusive ‘or’; i.e. exclusive disjunction), whose truth table is
A    B    A xor B
T    T    F
T    F    T
F    T    T
F    F    F

Inspection of the truth table for (A∨B)∧¬(A∧B) now reveals that it depends on the truth-values of A and B in exactly the same way as does the truth-value of A xor B: for each row of the truth table the two compounds take the same truth-values. This is interesting for two reasons. First, it verifies the claim that exclusive ‘or’ is implicit in the connectives ¬, ∧ and ∨. So we do not need a special symbol like xor for exclusive disjunction: we could simply define the exclusive disjunction of A and B to be the compound (A∨B)∧¬(A∧B). Second, we have a new concept: truth-functional equivalence. A pair X, Y of compounds are said to be truth-functionally equivalent if, like A xor B and (A∨B)∧¬(A∧B), X and Y take the same value at each row of the truth table generated by listing all distributions of truth-values over the set of all the sentence letters that appear in either compound. This set of sentence letters may be the same for both compounds, as it is above, but it may not. For example, A and A∧(B∨¬B) do not have the same set of sentence letters, but for all rows of the truth table generated by the four distributions of T and F over A and B, A and A∧(B∨¬B) take the same truth-values, and are therefore truth-functionally equivalent. We shall use the notation X ⇔ Y to signify that X and Y are truth-functionally equivalent. (Note that ⇔ is not itself a connective.) As we shall see in the following chapters, the notions of truth-functional equivalence and its extension, first-order equivalence, will turn out to be of fundamental importance. As an exercise in the truth table evaluation of compound sentences, we shall end this section by showing that (A→C)∧(B→C) and (A∨B)→C are truth-functionally equivalent:
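The same equivalence can be confirmed mechanically by running through every row of the truth table. The Python sketch below is only an illustration under stated assumptions: the function names implies and equivalent are invented for the example, compounds are represented as ordinary Python functions of their sentence letters, and the arrow is read by the truth table given for it in section 7 (false only when the antecedent is true and the consequent false).

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Truth table for X→Y: false only when X is true and Y is false."""
    return (not x) or y


def equivalent(f, g, n: int) -> bool:
    """True if f and g agree on every one of the 2**n rows over n letters."""
    return all(f(*row) == g(*row) for row in product([True, False], repeat=n))


# (A→C)∧(B→C) and (A∨B)→C, as functions of the letters A, B, C.
lhs = lambda a, b, c: implies(a, c) and implies(b, c)
rhs = lambda a, b, c: implies(a or b, c)

# A xor B, given directly and as the compound (A∨B)∧¬(A∧B).
xor_direct = lambda a, b: a != b
xor_defined = lambda a, b: (a or b) and not (a and b)

print(equivalent(lhs, rhs, 3))
print(equivalent(xor_direct, xor_defined, 2))
# A and A∧(B∨¬B) have different sentence letters but agree on all four rows.
print(equivalent(lambda a, b: a, lambda a, b: a and (b or not b), 2))
```

Each call tests the definition of ⇔ literally, row by row; nothing cleverer than exhaustive comparison is being attempted.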

Note There is no logical significance to the order in which the eight truth-value distributions over A, B and C are listed, though it is a good practical rule, as above, to start with all Ts, then all the ways (three) two Ts can be combined with one F, then all the ways (three) one T can be combined with two Fs, and then finish with all Fs. If a compound is built up from n distinct sentence letters, its truth table will have 2^n rows, since there are two ways of assigning T or F to the first letter, and for each of these there will be two ways of assigning T or F to the second, and for each of these there will be two ways of assigning T or F to the third, and so on, giving 2·2·2·…, n times, which is equal to 2^n.
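The counting argument can be checked by actually generating the rows. In this Python sketch the function name rows is an assumption of the example, and the rows come out in the order produced by itertools.product rather than in the order recommended above; as the note says, the order carries no logical significance.

```python
from itertools import product


def rows(letters):
    """All distributions of truth-values over the given sentence letters."""
    return [
        dict(zip(letters, values))
        for values in product([True, False], repeat=len(letters))
    ]


# Two choices for each letter, so n letters give 2**n rows.
for n in range(1, 6):
    letters = [chr(ord("A") + i) for i in range(n)]   # A, B, C, ...
    print(n, len(rows(letters)), 2 ** n)
```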

Exercises 1 Construct truth tables for the following compounds: (i) (B∨C)∧(C∨B) (ii) ¬(A∧¬C)∨B (iii) ¬A∧(¬C∨B) 2 Construct truth tables to show that (i) A∧(B∧C)⇔(A∧B)∧C and A∨(B∨C)⇔(A∨B)∨C. This property of ∧ and ∨ is called associativity. (ii) A∧(B∨C)⇔(A∧B)∨(A∧C) and A∨(B∧C)⇔(A∨B)∧(A∨C). These are the so-called distributivity laws.

7 THE CONDITIONAL It is time to look at the final connective in the list of connectives drawn up in section 2, the conditional or arrow →, intended to symbolise the English ‘if…then—’. Imagine that you are listening to an old-fashioned melodrama, and at one point one of the protagonists exclaims ‘You will not reveal all, or I am lost!’ The substance of this assertion could equally well be conveyed, albeit more prosaically, by the conditional ‘If you reveal all then I am lost.’ This suggests that whatever English sentences A and B might represent, ¬A∨B and A→B should be merely different formulations of the same information, and hence be truth-functionally equivalent. Supposing this to be the case, the truth table for A→B is
A    B    A→B
T    T    T
T    F    F
F    T    T
F    F    T

because that, as the reader should check, is the truth table for ¬A∨B. In words: A→B is false when A is true and B false, and true in all other cases. (Readers who have some familiarity with logic programming will probably be more accustomed to seeing A→B written B←A, i.e. ‘B if A’.) However, there seem to be other types of conditional statement in everyday life that are not expressed by →. One in particular seems to demand a definitely non-truth-functional analysis, and this is where the antecedent is counterfactual. For example, consider the sentence ‘If I had struck the match at that particular moment [t, say], a genie would have appeared’, where I did not in fact strike the match at that moment. Most people would regard this sentence as false, but if it is expressed as ‘I strike the match at moment t → a
genie appears’, and evaluated by means of the truth table for →, then, given that the antecedent is false (counterfactual), the truth table for → makes the sentence true! This goes strongly against intuition, and to make matters worse, ‘If I had struck the match at that particular moment a genie would not have appeared’ also comes out true on the truth-functional reading using →, because the antecedent remains false in this sentence too. We shall postpone further discussion of counterfactuals to Chapter 12, where we shall also look at some challenges to the truth-functional reading of some non-counterfactual conditionals. Exercises 1 Construct truth tables to show that the following truth-functional equivalences hold:

2 Let å(A, B) be false only when A is false and B true. How would one express å(A, B) using only the connective →? 3 Verify that A→B has the same truth table as ¬A∨B and ¬(A∧¬B).
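The difficulty with counterfactuals raised in this section can be seen at a glance by evaluating → with a false antecedent. The Python sketch below is illustrative only; the names arrow, match_struck and verdicts are invented for the example, and arrow simply encodes the truth table for A→B given above.

```python
def arrow(a: bool, b: bool) -> bool:
    """Truth table for A→B: false only when A is true and B is false."""
    return (not a) or b


match_struck = False   # the antecedent is counterfactual, hence false

# For either truth-value of 'a genie appears', record the verdicts on
# 'struck → genie appears' and on 'struck → genie does not appear'.
verdicts = [
    (genie, arrow(match_struck, genie), arrow(match_struck, not genie))
    for genie in (True, False)
]

for genie, would_appear, would_not_appear in verdicts:
    print(genie, would_appear, would_not_appear)
```

On the truth-functional reading both conditionals are true in every case, which is exactly the counter-intuitive result described above.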

8 SOME OTHER CONNECTIVES, AND THE BICONDITIONAL Other connectives in common use in English, like ‘unless’, ‘but’ and ‘only if’, for example, can be more or less faithfully defined in terms of ¬, ∧, ∨ and →, and we shall deal with them in turn: But ‘I went to see the film but I didn’t like it’ says, from the point of view of simple truth and falsity and shorn of the nuance of regret, ‘I went to see the film and I didn’t like it.’ Logic is concerned with the way the truth-values of sentences depend on each other, and not with one’s feelings about what the sentences describe; so from the purely logical point of view, ‘but’ is ‘and’. Unless ‘You will not reach 100 unless you first reach 99 (years of age)’ plausibly means the same as ‘If you do not first reach 99 you will not reach 100’; so we shall take ‘A unless B’ to mean the same as ‘If not B then A’, represented in the model by ¬B→A. Only if ‘You will reach the age of 100 only if you first reach the age of 99’ means the same as ‘If you don’t first reach the age of 99 you won’t reach the age of 100’; so we take ‘A only if B’ to mean the same as ¬B→¬A. Now look at the truth table for ¬B→¬A:
A    B    ¬B    ¬A    ¬B→¬A
T    T    F     F     T
T    F    T     F     F
F    T    F     T     T
F    F    T     T     T

But this is the truth table for A→B, which means that we can render ‘A only if B’ directly by A→B. If A→B is true then A is often called a sufficient condition for B, and B a necessary condition for A. There is one further connective which is often distinguished by being given a special symbol, even though it too will turn out to be definable in terms of other connectives among ∧, ∨, ¬ and →. This is the so-called biconditional ‘if and only if’, and it will be symbolised by ↔. It has its own symbol because statements of the form ‘A if and only if B’ crop up very frequently. However, the biconditional could also be defined in terms of ∧ and →. ‘A if B’ is clearly B→A, and we have just seen that ‘A only if B’ has the same truth table as A→B. Hence, A↔B, ‘A if and only if B’, is truth-functionally equivalent to (A→B)∧(B→A), from which it follows that its truth table is evaluated as
A    B    A→B    B→A    A↔B
T    T    T      T      T
T    F    F      T      F
F    T    T      F      F
F    F    T      T      T

i.e.: A↔B is true just in case A and B have the same truth-value. Exercises 1 You are in a country whose inhabitants randomly tell the truth or lie. You are trying to reach the capital city, which you know lies on the road you are following, but to your dismay the road forks. The capital is on one of the forks, but there is no signpost. A native of the country appears. A law of the country is that the natives are allowed only to answer ‘yes’ or ‘no’ to questions. How they will do so will depend on whether they are telling the truth or lying, but which they will do on any given occasion of course you simply don’t know (but you do know that they are very good at logic). All seems hopeless until you remember that, long ago, you attended a logic course. Suddenly you realise that there is a slightly complicated question you can ask this person, such that their answer will tell you for certain which fork the capital lies on. What is the question? (There is more than one, but the following method will certainly generate one. Consider a question of the form ‘Is X(A, B) true?’, where X(A, B) is a truth-functional compound of A and B which you are familiar with, A is the sentence ‘You are lying’, and B is ‘The capital lies on the left fork.’ You want the native’s answer to be ‘yes’ if and only if B is true, and working back from this will identify X(A, B). Alternatively, you may want to correlate the native’s ‘yes’ answer with B’s falsity; either
way, once the native has answered, you’ll know for sure whether the capital lies on the left fork.) 2 Which connective among ∧, ∨, ¬, → and ↔ would you use to represent ‘just in case’ in the statement ‘A∧B is true just in case A and B are both true’? 3 Display the truth-functional structure of the assertions in (i) and (ii) below using the sentence letters indicated and the appropriate connectives among ∧, ∨, ¬ and → (omit the ‘Therefore’ in each case). (i) If wage-settlements continue at this high level (A) or they increase (B), and nothing is done to take money out of the economy (C), then inflation will continue to rise (D) and we shall be in serious trouble (E). Therefore if nothing is done to take money out of the economy we shall be in serious trouble. (ii) Tracey won’t return Also sprach Zarathustra (A) unless Wayne gives her the £5 he owes her (B), but Wayne will not do this without Rudolf giving him some of the money (C). Amaryllis doesn’t want Rudolf to do this (D), and if Amaryllis doesn’t want Rudolf to then Rudolf won’t. Carlos will only be able to do his homework (E) if Tracey returns Also sprach Zarathustra. Therefore Carlos won’t be able to do his homework. 4 Which connective among ∧, ∨, ¬ and → would you use to represent ‘whilst’ in the sentence ‘Whilst I believe in law and order, the actions of the police sometimes make me unhappy’? 5 Give one way of expressing the truth-functional form of ‘Untidy or inaccurate work will cost you marks.’ (Be careful!) 6 Is the sentence ‘Jill and Siân are the opposing team’ a conjunction of two sentences? 7 ‘If you are over 18 and married, then your name will go on the register unless you have already received benefit.’ How would you formalise this statement using the propositional connectives already given?

Chapter 2 Truth trees 1 TRUTH-FUNCTIONALLY VALID INFERENCE We now have a slightly better vantage point from which to investigate the inferences with which we started in Chapter 1: If Lev is in Moscow then Irina is in Kiev. Lev is in Moscow. ∴ Irina is in Kiev. Cain was hairy and Abel was his victim. ∴ Cain was hairy. It’s raining or it’s snowing. It’s not raining. ∴ It’s snowing. These have the respective truth-functional representations (A and B will obviously represent different sentences in each):
(i) A→B, A ∴ B
(ii) A∧B ∴ A
(iii) A∨B, ¬A ∴ B

In each of (i)–(iii), consider what happens if we try to assume that the premises could be true and the conclusion false: (i) If B were to be false and A true, then the truth table tells us that A→B would be false. Thus we could not, on pain of contradiction, have true premises and a false conclusion. Therefore the inference is truth-functionally valid. It is usually referred to by its classical Latin name: modus ponens. (ii) is immediate. If A∧B is true then, from the truth table for ∧, both A and B are true. Hence in particular A must be true. (iii) is the disjunctive syllogism. Suppose that B is false, and that ¬A and A∨B are both true. Then we have a contradiction, for A must be false, and we have assumed that B is false, so that A∨B must be false, contrary to assumption. (i), (ii) and (iii) are therefore deductively valid inferences. They are deductively valid, moreover, by virtue of their truth-functional structure alone, and inferences which are valid by virtue of their truth-functional structure alone are called truth-functionally valid. More precisely, A truth-functionally valid inference is one whose premises and conclusion can

be represented as truth-functional compounds built up from some set of sentence letters, such that there is no distribution of truth-values over those sentence letters which makes the premises all true and the conclusion false. Just as the conclusion of a deductively valid inference is said to be a deductive consequence of the premises, so the conclusion of a truth-functionally valid inference is said to be a truth-functional consequence of the premises. If there is a distribution of truth-values making all the premises true and the conclusion false, then that distribution is called a truth-functional counterexample to the inference. Hence: An inference is truth-functionally valid just in case there is no truth-functional counterexample to it. A very important consequence of the definition of truth-functionally valid inference is that there is an algorithm for deciding whether any given inference is truth-functionally valid or not (an algorithm is a ‘mechanical’ procedure which decides all members of a given class of problems in finite time, and is now usually regarded as a program which can be run on a suitably powerful computer). If there are n sentence letters in the sentences making up the inference, we know that there are 2^n truth-value distributions over those sentence letters, and a truth table will evaluate all the sentences for each of these distributions. Then all we have to do is see whether one or more of those 2^n distributions make all the premises true and the conclusion false. However, for even quite moderate values of n, 2^n is a biggish number (2^20 is about a million). That search procedure is therefore exponentially complex, and it turns out that it contains in addition a lot of redundancy.
For we learn nothing about the validity of the inference from examining the truth-value distributions which make either the premises false or the conclusion true: the only relevant distributions when considering deductive validity are clearly just those which make the premises true or the conclusion false. This is where the tree diagrams for the connectives come in so useful, for as we shall see in the next sections, they can be used to systematically eliminate the uninformative paths in the search for counter-examples. Exercises 1 Show that if X and Y are any two truth-functional compounds, then X⇔Y if and only if X and Y have exactly the same truth-functional consequences. 2 Show that X⇔Y if and only if X is a truth-functional consequence of Y and Y is a truth-functional consequence of X.
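The truth table decision procedure described in this section can be written down in a few lines. The Python sketch below is a minimal illustration, not an efficient method: the function names counterexamples and valid are invented for the example, premises and conclusions are assumed to be given as functions from a row (a dictionary of truth-values) to True or False, and the last inference tested, ‘A→B, B ∴ A’, is an extra example not taken from the text.

```python
from itertools import product


def counterexamples(letters, premises, conclusion):
    """Rows making every premise true and the conclusion false."""
    found = []
    for values in product([True, False], repeat=len(letters)):   # 2**n rows
        row = dict(zip(letters, values))
        if all(p(row) for p in premises) and not conclusion(row):
            found.append(row)
    return found


def valid(letters, premises, conclusion):
    """Truth-functionally valid: no truth-functional counterexample."""
    return not counterexamples(letters, premises, conclusion)


a = lambda r: r["A"]
b = lambda r: r["B"]
a_arrow_b = lambda r: (not r["A"]) or r["B"]     # A→B
a_and_b = lambda r: r["A"] and r["B"]            # A∧B
a_or_b = lambda r: r["A"] or r["B"]              # A∨B
not_a = lambda r: not r["A"]                     # ¬A

print(valid(["A", "B"], [a_arrow_b, a], b))      # (i) modus ponens
print(valid(["A", "B"], [a_and_b], a))           # (ii)
print(valid(["A", "B"], [a_or_b, not_a], b))     # (iii) disjunctive syllogism
print(counterexamples(["A", "B"], [a_arrow_b, b], a))   # an invalid form
```

The loop visits all 2^n rows whether or not they are informative, which is the redundancy the tree method is designed to avoid.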

2 CONJUGATE TREE DIAGRAMS There is another way of displaying the information given in the truth tables for ∧, ∨, → and ↔, which as we shall see shortly generates a powerful and elegant method of proving truth-functional validity. This is to represent the truth tables for binary connectives by
tree diagrams, or more exactly by conjugate pairs of tree diagrams. Since the truth-value of, say, a conjunction depends only on the truth-values of its conjuncts, and not on the conjuncts themselves, we can write the truth table for the conjunction X∧Y of two compounds X and Y in the same way as if they were sentence letters:
X    Y    X∧Y
T    T    T
T    F    F
F    T    F
F    F    F

Now consider the following pair of diagrams:

For the time being, regard the boldface T’s and F’s as integral parts of each diagram (T and F are printed in bold type to distinguish them from the letters X and Y). Read upwards, the left-hand diagram can be interpreted as saying that when X and Y are both true (T), so is X∧Y, and the right-hand diagram can be interpreted as saying that if X is false (F), whatever the truth-value of Y, then so is X∧Y, and that if Y is false, whatever the truth-value of X, then so is X∧Y. So interpreted (and we shall see later that the diagrams can be read both upwards and downwards), the left-hand diagram represents the one row of the truth table for X∧Y in which X∧Y takes the value T, and the right-hand diagram represents the three rows for which X∧Y takes the value F. In other words, the pair of diagrams contains exactly the information contained in the truth table for X∧Y. The diagrams are called signed tree diagrams because they are tagged (or signed) with truth-values. We can eliminate these tags in the following way. Noting that when any sentence Z is false its negation ¬Z is true, we can rewrite the second diagram as

Now both the left- and the right-hand diagrams are signed only with Ts, and that being the case we can regard the Ts as understood and omit explicit mention of them. So we are left with a pair of unsigned diagrams, the unsigned conjugate tree diagrams for ∧:
    X∧Y              ¬(X∧Y)
     |               /    \
     X             ¬X      ¬Y
     Y

We could equally accurately have represented the F rows of the truth table for X∧Y by an unsigned diagram with three branches:

where each branch represents one row of the truth table for which X∧Y is F. However, the function of conjugate diagrams is not simply, or even primarily, to represent truth tables; their principal function is to provide rules of inference, for what will be called tree proofs, and for them to perform this role efficiently the number of branchings from any sentence is best limited to at most two. We shall see very shortly how these diagrams can then be put together to form elegant and virtually mechanical proofs. The left-hand diagram in a conjugate pair will always represent the T rows of the truth table for the relevant connective, and the right-hand diagram the F rows. With this in mind we turn to the truth table for ∨:

Reading from the table, we see that if X is true then X∨Y is true, whatever the truth-value of Y, and if Y is true then X∨Y is true, whatever the truth-value of X. This gives us the left-hand (signed) diagram for ∨:

There is just one row where X∨Y takes the value F, and that is the row at which both X and Y take the value F. This gives us the right-hand signed diagram for ∨:

Using the same unsigning procedure as for ∧, we obtain the pair of unsigned conjugate diagrams for ∨:
    X∨Y              ¬(X∨Y)
   /    \               |
  X      Y             ¬X
                       ¬Y

Notice the mirror symmetry between the pair of diagrams for ∧ and that for ∨: if we change ∨ to ∧ in the diagrams for ∨, and negate each sentence, eliminating all double ¬’s, we obtain the diagrams for ∧; and vice versa. This is a special case of a principle called the Duality Principle, which we shall return to in Chapter 3. The same procedure as that used to obtain the diagrams for ∧ and ∨ yields the conjugate unsigned diagrams for → and ↔:

Where the lower sentences in these diagrams occur in pairs it is customary to write one member of the pair above the other, as above. No priority is implied in this listing, and it can be reversed without changing the diagram characteristics, as we shall see. The reader may be wondering what has happened to ¬. It too is a truth-functional connective, defined by a truth table, and consequently it too will have a pair of unsigned diagrams representing its truth table. Since its properties have already been implicitly used in converting signed diagrams into unsigned ones, it might seem that the unsigned diagrams for negation itself are unlikely to be very informative. To some extent this is true, but by no means entirely. The signed diagrams are these:


yielding the unsigned diagrams:

Clearly, the left-hand diagram of this pair is totally uninformative. This leaves us with only one non-trivial diagram, the right-hand one. Though a single diagram, it will turn out to be very useful, so useful that it is given a special name, the Rule of Double Negation. Exercises 1 Construct the truth tables for the sentence Z represented in (a) and (b) below by the left-hand unsigned member of each of a pair of unsigned conjugate diagrams, and express Z as a truth-functional compound of X and Y, employing any of the connectives ∧, ∨, ¬ and → (remember that the left-hand diagram always represents the T rows of the table for Z).

(a)

(b) 2 Construct the truth table for the compound which these conjugate diagrams determine, and rewrite the right-hand diagram so that it satisfies the condition that no diagram has more than two branches:
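
The relation between conjugate diagrams and truth tables described in this section can be checked mechanically. The following is a minimal sketch in Python, not part of the original text: the names `TABLES`, `DIAGRAMS`, `rows_admitted` and `diagrams_match_table` are our own, and since the printed diagrams for → and ↔ have not survived here, the ones encoded below are the standard ones and should be read as an assumption. Each diagram is written as a list of branches, and a signed letter such as `("X", False)` stands for ¬X.

```python
from itertools import product

# Truth tables for the binary connectives, keyed by our own ASCII labels.
TABLES = {
    "and": lambda x, y: x and y,
    "or": lambda x, y: x or y,
    "imp": lambda x, y: (not x) or y,
    "iff": lambda x, y: x == y,
}

# Unsigned conjugate diagrams: (left-hand diagram, right-hand diagram).
# A diagram is a list of branches; a branch is a list of signed letters,
# ("X", True) standing for X and ("X", False) for ¬X.
DIAGRAMS = {
    "and": ([[("X", True), ("Y", True)]],
            [[("X", False)], [("Y", False)]]),
    "or": ([[("X", True)], [("Y", True)]],
           [[("X", False), ("Y", False)]]),
    "imp": ([[("X", False)], [("Y", True)]],
            [[("X", True), ("Y", False)]]),
    "iff": ([[("X", True), ("Y", True)], [("X", False), ("Y", False)]],
            [[("X", True), ("Y", False)], [("X", False), ("Y", True)]]),
}


def rows_admitted(diagram):
    """Return the truth-table rows (x, y) satisfied by at least one branch."""
    rows = set()
    for x, y in product([True, False], repeat=2):
        value = {"X": x, "Y": y}
        if any(all(value[letter] == sign for letter, sign in branch)
               for branch in diagram):
            rows.add((x, y))
    return rows


def diagrams_match_table(name):
    """Left diagram must give exactly the T rows, right diagram the F rows."""
    all_rows = set(product([True, False], repeat=2))
    t_rows = {row for row in all_rows if TABLES[name](*row)}
    left, right = DIAGRAMS[name]
    return (rows_admitted(left) == t_rows
            and rows_admitted(right) == all_rows - t_rows)


for name in DIAGRAMS:
    print(name, diagrams_match_table(name))
```

Every line printed should end in `True`: in each pair the left-hand diagram admits exactly the T rows of the table and the right-hand diagram exactly the F rows, with no diagram needing more than two branches.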

3 TRUTH TREES The conjugate diagrams of the previous section (together with the Rule of Double Negation) can be used to yield simple graphic demonstrations of truth-functional validity. We shall start with some very simple examples, the inferences (i)–(iii) of section 1. First (i), modus ponens:


Asking whether it is possible for the premises A→B, A to take the truth-value T and the conclusion, B, the value F is, of course, equivalent to asking whether A→B, A, and ¬B can all take the value T. Write down A→B, A, ¬B, signifying that we are supposing, for the sake of argument, that they’re all true:

These will be called the initial sentences. The order in which they are listed is immaterial: they’re simply a set of three sentences assumed, for the moment, to be all true. Now write underneath the initial sentences the lower part of the tree diagram for A→B:

We now have a small upside-down tree:

If we define a branch in the tree to be a continuous path up from the terminal lower sentences to the topmost sentence of the tree, then the tree above has two branches, one carrying the sentences A→B, A, ¬B, ¬A, and the other A→B, A, ¬B, B. This tree is called a truth tree. The information it contains is probably more immediately conveyed by the equivalent signed version:

On the left-hand branch we see AT and AF, and on the right-hand branch BT and BF; neither is a consistent assignment of truth-values. As those two branches represent all the possibilities admitted by the initial assignment, this signed tree shows that the assumption


that A→B and A are true and B false is impossible: it leads to a contradictory assignment of T and F either to B or to A. The original tree also showed this, if less explicitly, by having both A and ¬A on one branch, and B and ¬B on the other. At any rate, we infer that there can be no truth-functional counterexample to the inference; i.e. no distribution of truth-values to A and B which makes the premises A→B and A both true and the conclusion B false. In other words, the inference A, A→B ∴B is truth-functionally valid. Of course, we already knew that, by means of an argument in section 1 that in some ways resembles an informal version of the one above. But the tree format has the advantage over the informal argument that it extends to a simple and mechanical method for deciding the validity or otherwise of any truth-functional inference whatever, no matter how complex. Now for (ii):

Here the sole premise is A∧B, and the conclusion is A, and we shall again use a tree— they will all be unsigned from now on—to show that the assignment of T to A∧B and F to A, i.e. of T to both A∧B and ¬A, is impossible. This time we get a tree with only one branch

on which the pair A, ¬A appear. These cannot both be true, so again we conclude that our assumption, that a truth-value distribution over A, B exists which satisfies both A∧B and ¬A, leads to a contradiction. Hence the inference from A∧B to A is truth-functionally valid. Finally, (iii). Again, we write down the premises and negation of the conclusion, implicitly assuming them all true:

We see immediately that we have A and ¬A on one branch, and B and ¬B on the other. Since these branches exhaust the possible ways in which the initial sentences can be true together, we infer that those initial sentences cannot all be true together, and so (iii) is


truth-functionally valid. This all seems very promising. Let us see how it fares with a slightly more complex inference: If the mark rises or the yen rises the dollar will fall. The dollar will not fall. ∴ It is not the case that the mark rises and the yen rises. This has the truth-functional form

(A∨B)→C, ¬C ∴ ¬(A∧B)

As in the previous examples, we list the premises and the negated conclusion, so:

(A∨B)→C
¬C
¬¬(A∧B)

and, starting from this set of initial sentences, we shall use the conjugate diagrams for the connectives to decompose the compound sentences into simpler ones until the tree generated in the process finally tells us, by depositing a sentence and its negation on every branch, that there is no way all the initial sentences can be true. The final sentence on the list of initial sentences is the double negation ¬¬(A∧B), which seems a good enough place to start. Applying the rule of double negation to ¬¬ (A∧B), we extend a branch downwards as follows, numbering the lines as we go:

In the terminology we shall use from now on, ¬¬(A∧B) has been used. For book-keeping purposes we can indicate this by placing a tick beside it—we don’t want inadvertently to use it twice. Proceeding downwards, we can now write the diagram for A∧B directly beneath A∧B:


Now what? We now look around for another compound sentence to use by writing the tree diagram for it under the lowest point of the tree. There is only one such sentence, the topmost, the conditional (A∨B)→C. This has antecedent A∨B and consequent C. Directly beneath B on line 6 write the diagram for the conditional with that antecedent and consequent:

We have now used (A∨B)→C, and in so doing extended the tree to one with two branches. The right-hand branch, terminating in C, contains the negation ¬C of C as well as C, and therefore (visualise the tree signed with truth-values) represents an impossible truth-value assignment. Accordingly, we shall close that branch by writing a line beneath it to signify that it must not be continued. As soon as any two sentences occur on a branch, one of which is the negation of the other, that branch is closed. No further attention is paid to closed branches; they merely represent failed attempts to make the initial sentences all true. The remaining unclosed branches on any tree are said to be open. There is now one open branch on the tree above, the left-hand one. It also contains a compound sentence for which there is a tree diagram, namely ¬(A∨B). Accordingly we continue the left-hand branch by writing the tree diagram for ¬(A∨B) beneath ¬(A∨B):


Now all the branches have closed. When this happens the tree itself is said to close, a phenomenon we have now learned to interpret as meaning that no distribution of truth-values at all over A, B and C will make the initial sentences jointly true. Therefore the inference we started with is truth-functionally valid: no distribution of truth-values over its sentence letters makes the premises true and the conclusion false. It makes little difference in what order we use the sentences on a tree. In particular, it makes no difference to whether the tree closes or not (we shall give a rigorous proof of this in Chapter 4). We can check that this is so in the example above by constructing a tree in which the sentence (A∨B)→C is used first:

Important note In the event of there being more than one open branch at the point at which a sentence is used, place the tree diagram for that sentence at the end of every such branch. Bearing in mind that the tree diagrams give the truth-conditions for their respective connectives, it’s not difficult to see why this should be done: each open branch of the tree represents a different way in which the sentences on it which have already


been used can be jointly true. Each time a further sentence is used, that represents a further constraint, and one which has to be applied uniformly to all open continuations of that branch. Hence when a sentence is used in a tree, its diagram must be placed on every open branch that passes through it. We have observed that we can interpret a closed tree as indicating that there is no distribution of truth-values over its sentence letters which will make the initial sentences all true. What if we eventually use all the usable sentences on the tree and it doesn’t close? The tree in these circumstances is said to be finished and open. Can we infer that there is a truth-value distribution over the sentence letters that will make all the initial sentences true? The answer is ‘yes’. Informally, the argument is as follows: each open branch of the tree represents a way in which the sentences on it which have been used can be true. If there is no further sentence to be used and there are still open branches, this means that there are ways in which all the sentences on those branches, including the initial sentences, can all be true. To sum up: Truth trees constructed from a set of initial sentences tell us whether there is a truth-value distribution over their sentence letters which will make the initial sentences true. If there is no such distribution, the tree will close. If there is one, it won’t. So far we have admittedly given only rather intuitive arguments for these claims, especially the last, but eventually we shall be in a position to give a more rigorous one. We shall end this section with a résumé of how to build trees. First, write the initial sentences (when testing inferences, these are the premises and the negation of the conclusion) in any order. Then select what looks like the simplest compound sentence and write its diagram under the initial sentences. That sentence is now used.
Now choose another compound and write its diagram under the last sentence on each open branch if there is more than one. Continue doing this, closing every branch on which a sentence and its negation both appear, until the tree itself either closes or else terminates without closing. Procedural remark Neither numbering the lines of a tree nor ticking used sentences is an essential part of tree construction; for extended trees these are useful devices, for simple ones usually unnecessary. Historical note Despite the fact that tree proofs of validity are a relatively modern development, they are really just a way of representing what logicians have traditionally called reductio ad absurdum arguments. In these you assume as true the premises and also the negation of the conclusion which allegedly follows from them, and you try to deduce a contradiction (‘reduce them to absurdity’). If you succeed, that shows that the negation of the conclusion is inconsistent with the premises. The tree construction is a powerful modern systematisation of this ancient method of proof. Exercises 1 Formalise the following inferences using sentence letters and the appropriate


connectives, and construct truth trees to show that each is truth-functionally valid. (i) The butler did it or the gardener did it. The gardener did not do it. ∴ The butler did it. (ii) The butler did it or the gardener did it. ∴ If the butler did not do it then the gardener did. (iii) If the butler did not do it then the gardener did. ∴ The butler did it or the gardener did it. (iv) If the butler did it then the gardener did not. ∴ If the gardener did it then the butler did not. (v) ‘If naive realism is true then naive realism is false [not true]. ∴ Naive realism is false’ (Bertrand Russell). (vi) If the government raises interest rates then there will be inflation. If the government does not raise interest rates then there will be inflation. ∴ There will be inflation. 2 Suppose you have constructed a tree to test the validity of an inference, and all the branches close with one of the initial sentences still unused. What does this tell you (i) if the unused sentence is a premise, and (ii) if the unused sentence is the negated conclusion?
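
The tree-building procedure summarised in this section is mechanical enough to be sketched as a short program. What follows is an illustrative Python sketch under our own assumptions, not the book's official formulation: sentences are encoded as nested tuples such as `("imp", "A", "B")` for A→B, and the names `neg`, `diagram`, `closes` and `valid` are hypothetical. Instead of drawing the whole tree, the function `closes` follows each branch in turn, which comes to the same thing because a used sentence's diagram is carried down every open branch below it.

```python
def neg(x):
    """Return the negation of the sentence x."""
    return ("not", x)


def diagram(s):
    """Lower part of the conjugate diagram for s: a list of branches,
    or None if s is a sentence letter or a negated sentence letter."""
    if isinstance(s, str):
        return None
    if s[0] != "not":
        op, x, y = s
        return {
            "and": [[x, y]],
            "or": [[x], [y]],
            "imp": [[neg(x)], [y]],
            "iff": [[x, y], [neg(x), neg(y)]],
        }[op]
    t = s[1]
    if isinstance(t, str):
        return None
    if t[0] == "not":
        return [[t[1]]]  # Rule of Double Negation
    op, x, y = t
    return {
        "and": [[neg(x)], [neg(y)]],
        "or": [[neg(x), neg(y)]],
        "imp": [[x, neg(y)]],
        "iff": [[x, neg(y)], [neg(x), y]],
    }[op]


def closes(branch, unused=None):
    """True if every branch of a tree grown from `branch` closes."""
    branch = list(branch)
    if unused is None:
        unused = list(branch)
    # A sentence and its negation on the same branch: the branch is closed.
    if any(neg(s) in branch for s in branch):
        return True
    for i, s in enumerate(unused):
        lower = diagram(s)
        if lower is not None:
            rest = unused[:i] + unused[i + 1:]  # s is now used
            return all(closes(branch + new, rest + new) for new in lower)
    return False  # finished and open


def valid(premises, conclusion):
    """Test an inference: premises plus negated conclusion must close."""
    return closes(list(premises) + [neg(conclusion)])


# Modus ponens, then the mark/yen/dollar inference, then an invalid inference.
print(valid([("imp", "A", "B"), "A"], "B"))
print(valid([("imp", ("or", "A", "B"), "C"), neg("C")],
            neg(("and", "A", "B"))))
print(valid([("imp", "A", "B"), "B"], "A"))
```

The first two calls report a closed tree and the third a finished open one. The order in which sentences are used here is simply their order on the list of unused sentences; as remarked above, a different order would change the shape of the tree but not whether it closes.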

4 TAUTOLOGIES AND CONTRADICTIONS Consider the compound A∨¬A. This takes the truth-value T whatever the truth-value of A might be. So does ¬(A∧¬A), while A→(B→A) and (A∧¬A)→B take the value T whatever the truth-values of each of A and B might be. Compounds like these which take the value T for all distributions of truth-values over their sentence letters are called tautologies. Compounds which take the value F for all values of their sentence letters are called contradictions. We can immediately infer that the negation of a tautology is a contradiction, and the negation of a contradiction is a tautology. We can use truth trees to test for tautologousness and for contradictoriness. No truth-value distribution over sentence letters can make a contradiction true, which means that if we construct a truth tree from it then eventually the tree will close. Conversely, if the tree closes then there is no truth-value distribution over sentence letters that makes the initial sentence true, and it is therefore a contradiction. Example


i.e. ¬((A∧B)→A) is a contradiction. We can now also construct a tree test for tautologousness. For if a sentence is a tautology then its negation is a contradiction, which means that if we construct a tree from its negation the tree will close. In other words, X is a tautology if and only if a tree generated from ¬X closes. Beginners are always tempted to say that a compound is a tautology just in case a tree generated from it has only open branches. This is definitely incorrect, though it is left as an exercise to say why. Example (A→(B→C))→((A→B)→(A→C)). This is a tautology, as the following closed tree establishes (supply the justification for each line):

Finally, we can test for whether a compound is neither a tautology nor a contradiction. Construct a tree from ¬X, and another from X. Neither tree closes if and only if X is neither a tautology nor a contradiction. A tautology of the form X↔Y is called a tautological biconditional. There is an intimate relationship between tautological biconditionality and truth-functional equivalence: where X and Y are any truth-functional compounds, X⇔Y if and only if the compound (X↔Y) is a tautology (this is easy to show, and is left as exercise 1 below). This means that we can use a tree test for truth-functional equivalence, for we know how to use trees to test for tautologousness. Thus we can decide whether X⇔Y by seeing whether a tree generated from ¬(X↔Y) closes; if it does, X⇔Y. We can even shorten the procedure as follows. The tree diagram for a negated biconditional is this:


A tree generated from ¬(X↔Y) will therefore close if and only if the two trees generated by the pairs {X, ¬Y} and {Y, ¬X} of initial sentences both close. Hence X⇔Y if and only if trees generated from both pairs {X, ¬Y} and {Y, ¬X} of initial sentences close. Note that this result also tells us that X⇔Y if and only if X is a truth-functional consequence of Y and Y is a truth-functional consequence of X. Exercises 1 Let X and Y be truth-functional compounds. Explain why X⇔Y if and only if X↔Y is a tautology. 2 Show that any sentence is a truth-functional consequence of a contradiction, and that a tautology is a truth-functional consequence of any sentence. 3 Of the following, state which are tautologies, contradictions or neither, and construct trees to justify your statements. (i) A→(B→A) (ii) A→(¬A→B) (iii) A∧¬(A∨B) (iv) (A∨B)→B (v) (A∧B)→A (vi) A→¬A (vii) ¬(A→A) (viii) ¬(A→A) (ix) A∧¬(B→A) 4 Suppose a tree generated by a single sentence X has only open branches. Does this mean that X is a tautology? 5 Of the following pairs of sentences, state which are truth-functionally equivalent to each other, and construct trees to justify your statements. (a)

A→(B→C)

(A∧B)→C

(b)

A→B

¬A→¬B

(c)

(C∨A)∧(B∨A)

(C∧B)∨A

(d)

A∧C

¬(A→¬C)

(e)

(A→B)∧(C→B)

(A∨C)→B

(f)

(A→B)∧(A→C)

A→(B∨C)

(g)

(A∧¬A)∨(B∨C)

B∨C

(h)

A∨(B→D)

¬B∨(A∨D)

(i)

¬(A↔B)

(A∧¬B)∨(B∧¬A)

(j)

A∧¬A

B∧¬B

6 In what follows let T be an arbitrary tautology and ⊥ an arbitrary contradiction, as above. Let X be any truth-functional compound. With reference to appropriate trees,


explain why

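
The classifications discussed in section 4 can be cross-checked without trees, by brute force over truth tables. The sketch below deliberately swaps in that technique: it is a plain truth-table check in Python, not a tree test, and the tuple encoding of sentences and the names `letters`, `value`, `column`, `classify` and `equivalent` are assumptions of ours.

```python
from itertools import product


def letters(s):
    """Set of sentence letters occurring in s."""
    if isinstance(s, str):
        return {s}
    return set().union(*(letters(part) for part in s[1:]))


def value(s, v):
    """Truth-value of s under the distribution v (a dict from letters to bools)."""
    if isinstance(s, str):
        return v[s]
    if s[0] == "not":
        return not value(s[1], v)
    x, y = value(s[1], v), value(s[2], v)
    return {"and": x and y, "or": x or y,
            "imp": (not x) or y, "iff": x == y}[s[0]]


def column(s, names):
    """Final column of the truth table of s over the given letters."""
    return [value(s, dict(zip(names, row)))
            for row in product([True, False], repeat=len(names))]


def classify(s):
    col = column(s, sorted(letters(s)))
    if all(col):
        return "tautology"
    if not any(col):
        return "contradiction"
    return "neither"


def equivalent(x, y):
    """X ⇔ Y: the same truth-value under every distribution."""
    names = sorted(letters(x) | letters(y))
    return column(x, names) == column(y, names)


excluded_middle = ("or", "A", ("not", "A"))
print(classify(excluded_middle))
print(classify(("not", excluded_middle)))
print(classify("A"))
print(classify(("imp", ("imp", "A", ("imp", "B", "C")),
                ("imp", ("imp", "A", "B"), ("imp", "A", "C")))))
double_neg = ("not", ("not", "A"))
print(equivalent("A", double_neg), classify(("iff", "A", double_neg)))
```

The last line illustrates the link between equivalence and tautological biconditionals on one instance: A ⇔ ¬¬A holds, and A↔¬¬A comes out as a tautology.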

Chapter 3 Propositional languages 1 PROPOSITIONAL LANGUAGES In Chapter 1 we introduced the notion of a propositional language, as a formal model of the class of truth-functional compounds which can be generated from some initially given set of sentences, using some specified set of connectives. These formal languages are of interest for two reasons: (i) they form the framework for the detailed development of truth trees in Chapter 4; and (ii) a characteristic feature of their syntactic structure allows a powerful method, called inductive proof, to be used to investigate their properties. Nearly all the results of this chapter will be directly or indirectly obtained by this method. However, if the reader wants to continue the development of truth trees where the last chapter left off, they can skip this chapter and proceed directly to Chapter 4, where enough about propositional languages to make the discussion intelligible will be explained at the outset. These preliminaries over, let S be a set of sentence letters and Π some set of connectives. Let L[S; Π] denote the set of all the truth-functional compounds which can be constructed using sentence letters from S and connectives from Π. L[S; Π] is called the propositional language generated by the sentence letters in S and the connectives in Π. We shall often refer to an arbitrary propositional language simply as L. Following the notational convention introduced in Chapter 1 we shall use capitals …, X, Y, Z from the end of the Roman alphabet to refer to arbitrary sentences of L. When the members of either S or Π are explicitly displayed, as they are in L[{A, B}; {¬, ∧}], for example, we shall omit the set brackets {, } and simply write L[A, B; ¬, ∧]. S can be either finite or infinite, but Π will be assumed to be finite; it does not have to be the set {∧, ∨, ¬, →, ↔}, or even include any members of it.
If we want to define a new five-place connective © where ©(A, B, C, D, E) is true, say, when A and D are true, and false in all other $2^5 - 2^3 = 24$ cases, then there is a perfectly respectable propositional language whose set of connectives contains ©. For all choices of Π except the empty set the number of distinct sentences in L[S; Π] is infinite, even when S contains just one letter. For example, suppose S is just {A} and Π is just {¬}. Then all of ¬A, ¬¬A, ¬¬¬A, ¬¬¬¬A, etc. are in L. The reader should keep in mind that though A and ¬¬A are truth-functionally equivalent, they are nevertheless different sentences: A has no occurrences of ¬ while ¬¬A has two. Exercises 1 How many sentences are there in (i) L[A, B; Ø] (Ø is the standard symbol for the empty set; in other words, there are no connectives in this language), (ii) L[A; ¬], and (iii) L[A; →]?


2 Is A↔B a sentence in L[A, B; ∧, ∨, ¬, →]? If not, does this mean that there is no biconditional sentence in L? 3 Show that there are infinitely many sentences in L[A, B; ∨, ¬] truth-functionally equivalent to A→B.
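
Two of the claims in section 1 lend themselves to a quick computational illustration: the count of rows on which the five-place connective © is false, and the way a language grows without bound as soon as it has a connective. The Python sketch below rests on our own assumptions: sentences are represented as plain strings, binary compounds are fully bracketed, and the function name `sentences` and its parameters are hypothetical.

```python
from itertools import product

# Rows of the truth table for a five-place connective ©(A, B, C, D, E)
# that is true exactly when A and D are both true.
rows = list(product([True, False], repeat=5))
true_rows = [row for row in rows if row[0] and row[3]]
print(len(rows), len(true_rows), len(rows) - len(true_rows))  # 32 8 24


def sentences(letters, unary, binary, rounds):
    """Sentences obtainable from the letters in at most `rounds` rounds of
    applying the unary and binary connectives to what is already there."""
    current = set(letters)
    for _ in range(rounds):
        grown = set(current)
        grown |= {u + x for u in unary for x in current}
        grown |= {"(" + x + b + y + ")"
                  for b in binary for x in current for y in current}
        current = grown
    return current


for rounds in range(4):
    print(rounds,
          len(sentences("AB", "", "", rounds)),   # no connectives
          len(sentences("A", "¬", "", rounds)),   # negation only
          len(sentences("A", "", "→", rounds)))   # the conditional only
```

With no connectives the count never moves, whereas with ¬ or → every further round adds new sentences, which is the informal reason why such languages are infinite even when S contains a single letter.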

2 OBJECT-LANGUAGE AND METALANGUAGE Throughout this book we shall be using more or less standard English to discuss the structure of the formally generated entities intended to be the symbolic representations of English (or any natural-language) sentences themselves. The language in which these structures and their relationships are discussed (in this case English) is called the metalanguage. The formal languages, like the propositional languages we have just introduced, whose structure is under discussion in the metalanguage, are usually called object-languages. Where, as here, a special symbolism is used for the object-language, the meta-/object-level distinction is fairly explicit. However, sometimes special symbols are used also in the metalanguage. For example, ⇔ is used in this book as a metalinguistic symbol, denoting a relation between object-language sentences. The letters …, X, Y, Z used to refer to ‘arbitrary’ sentences in a propositional language L are also metalinguistic objects; they are correctly classified as metalinguistic variables, ranging over a domain consisting of all the sentences of L. The letter L itself is also a metalinguistic variable, ranging over a domain consisting of propositional languages. In saying that English is used as a metalanguage for the discussion of a propositional object-language, we are making a meta-meta-level assertion, since we are now discussing the relationship of ordinary English itself to its object-language(s). Yet we are doing so in ordinary English! In other words, one and the same language (ordinary English in this case) is made to operate on more than one level. But this is true to a great extent even as regards the metalanguage of the object-languages we shall discuss in this book and those object-languages themselves.
The reader will soon note, if they haven’t already, that in our metalinguistic discussion we are using terms like ‘and’, ‘or’, ‘not’, ‘if…then…’ and ‘if and only if’, which are, in symbolic form, part of the propositional languages described. There is nothing wrong with this; after all, children are taught the structure of their own language using that language. The results we shall establish about the properties of propositional and later first-order languages, and the system of formal deduction each will be associated with, are sometimes called metatheorems, because they are established using (usually rather informal) reasoning within the metalanguage. This metalinguistic reasoning itself is often called metalogical, or metatheoretic. Most of what appears in textbooks of logic is in fact metalogic, since its object is to discuss properties of formal systems. 3 ANCESTRAL TREES Consider the propositional language L[A, B, C,…; ¬, ∧, ∨, →]. We observed in Chapter 1


that its class of sentences can be specified in a very compact way. Call any finite string of symbols drawn from the list enclosed in square brackets in L[…] an expression of L. Then an expression of L is a sentence of L if and only if it is (i) a sentence letter of L, or (ii) of the form ¬Y, (Y∨Z), (Y∧Z) or (Y→Z), where Y and Z are sentences of L. These clauses define a class of objects (sentences of L) by stating that certain things (sentence letters) are in that class unconditionally, and that certain other things are in it conditionally on their being built in a specific way out of things already registered as members. A definition which specifies a class in terms of this sort of absolute-plus-conditional membership criterion is called an inductive definition. The ability of a class of entities to be defined inductively is a highly prized characteristic, because classes so defined obey a related principle of induction which can be used to elicit a range of other interesting properties shared by their members; more on this shortly. Y is called the immediate predecessor of the sentence ¬Y, and Y and Z the immediate predecessors of each of the sentences Y∧Z, Y∨Z, Y→Z. Sentence letters are decreed to have no immediate predecessors. We can depict the immediate predecessor relation graphically as follows:

where Y in the first diagram and Y and Z in the second are the immediate predecessors of X. Do not confuse these diagrams with the conjugate tree diagrams for the connectives: the latter specify truth-conditions, whereas those above convey purely structural information. If either Y or Z in the diagrams above is not a sentence letter we can continue the diagrams downwards, to include their immediate predecessors, and so on until we eventually reach sentence letters which, because they have no immediate predecessors, are terminal points of the resulting tree. By analogy with human genealogy we shall call this tree the ancestral tree of X. Here is the ancestral tree of the sentence B→(B∨(A∧¬C)):

The nodes on this tree (i.e. the junction-points of line-segments, including the initial and


terminal points) below X are all predecessors of X; another term for these is proper subsentences of X. Thus the ancestral tree is also the subsentence tree of X. Exercises Draw the ancestral trees of the following sentences: (a) A∨¬(A→(B∨C)) (b) ¬(C∨D)∧(D∧¬C) (c) A→(B→¬(C∧¬D))
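
The inductive definition of a sentence and the immediate predecessor relation translate directly into a recursive data structure. The following Python sketch is one possible encoding, offered as an illustration only: sentences are nested tuples, and the names `show`, `immediate_predecessors`, `ancestral_tree` and `terminal_nodes` are our own. The tree is printed with indentation instead of line-segments, each node appearing above and to the left of its immediate predecessors.

```python
SYMBOL = {"and": "∧", "or": "∨", "imp": "→"}


def show(s, top=True):
    """Write the sentence s in the usual notation, dropping outer brackets."""
    if isinstance(s, str):
        return s
    if s[0] == "not":
        return "¬" + show(s[1], top=False)
    body = show(s[1], top=False) + SYMBOL[s[0]] + show(s[2], top=False)
    return body if top else "(" + body + ")"


def immediate_predecessors(s):
    """Sentence letters have none; ¬Y has Y; a binary compound has Y and Z."""
    return [] if isinstance(s, str) else list(s[1:])


def ancestral_tree(s, depth=0):
    """Lines of the ancestral tree of s, one node per line."""
    lines = ["  " * depth + show(s)]
    for p in immediate_predecessors(s):
        lines.extend(ancestral_tree(p, depth + 1))
    return lines


def terminal_nodes(s):
    """Terminal points of the ancestral tree, from left to right."""
    preds = immediate_predecessors(s)
    if not preds:
        return [s]
    return [leaf for p in preds for leaf in terminal_nodes(p)]


# The sentence B→(B∨(A∧¬C)) whose ancestral tree is described above.
X = ("imp", "B", ("or", "B", ("and", "A", ("not", "C"))))
print("\n".join(ancestral_tree(X)))
print(terminal_nodes(X))
```

Every line below the first is a proper subsentence of X, and the terminal nodes are all sentence letters, as the definition requires.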

4 AN INDUCTION PRINCIPLE The sentences in any propositional language all have ancestral trees (the ancestral tree of a sentence letter A is just the single node consisting of A), a fact which is exploited in a very useful method of proving general results about these languages called the Principle of Induction on Immediate Predecessors. This says the following. Suppose P is any property which it makes sense to speak of sentences of L having or not having. Then: If (1) all the sentence letters of L have P, and (2) from the assumption that the immediate predecessors of any non-atomic sentence X in L have P, it follows that so too does X, then (3) every sentence in L has P. To see why the principle is true, suppose L is any propositional language including the sentence letters A, B, C and the connectives ∧, ∨, ¬, →. Let X be B→(B∨¬(A∧¬C)) for example. Look at its ancestral tree given in section 3 above. Suppose that assumptions (1) and (2) above are satisfied. (1) tells us that A, B and C have P, and (2) tells us that we can think of a single line-segment or pair of line-segments in the ancestral tree as carrying possession of P upwards from those lower nodes to their successor node. But since, by (1), each of the terminal nodes B, B, A and C (from left to right in the tree) has P, it follows that possession of P is carried from successive level to successive level up the tree, until it is finally inherited by the topmost node X itself. Clearly, the same argument will establish that any given sentence of L has P. The Induction Principle is really no more than a roundabout way of saying that every sentence in a propositional language has an ancestral tree. But it is very useful, and we shall show by means of some examples how the principle can be used as a powerful (meta)proof-technique. We shall start with a very simple example. Consider the language L[A; ∧]. We shall show, by induction, that no sentence in L is a tautology.
Let the property P in the statement of the Induction Principle be is not a tautology. First, we have to show that (1) in the statement of the principle is satisfied. This means that we have to show that every sentence letter of L is not a tautology. There is only one sentence letter in L, A, and A is not a tautology because it has the truth table


which obviously contains at least one F; exactly one, in fact. Now we must show that (2) is satisfied. This is traditionally called the induction step. Suppose X is any sentence of L other than a sentence letter, i.e. other than A. X must therefore contain at least one occurrence of ∧, and hence be of the form Y∧Z, for some pair of sentences Y, Z of L. X’s immediate predecessors are therefore Y and Z, and we have to show that if both these have P, then so does X. Well, suppose that Y and Z have P, i.e. suppose that neither Y nor Z is a tautology. In that case, both Y and Z will have at least one F in their truth tables, from which it follows that Y∧Z must have at least one F too, since we know that a conjunction is false if either conjunct is. Hence Y∧Z, i.e. X, is not a tautology. So we have shown that if X’s immediate predecessors have P, so too does X, and we can invoke the Induction Principle to infer that all sentences in L have P, i.e. are not tautologies. Q.E.D. (Q.E.D. stands for ‘quod erat demonstrandum’, i.e. ‘which was to be demonstrated’. This is the phrase, translated into Latin from the original Greek, with which Euclid signalled the end of a demonstration in his celebrated Elements. The three letters, boldface, will be used to perform the same function throughout this book.) To sum up: a proof by induction on immediate predecessors proceeds in the following three stages: (1) establish that all the sentence letters in L have P, the property in question; and (2) show that from the assumption that the immediate predecessors of some arbitrary sentence Z have P (this provisional assumption is called the inductive hypothesis), it follows that Z too must have P. If stages (1) and (2) are both successfully accomplished, then we invoke the Principle of Induction to conclude that (3) all the sentences of L have P. We shall now look at another proof by induction, in which the induction step is a so-called proof by cases. This time L is the language L[A; →], i.e.
the set of all sentences which can be constructed from A and the single connective →. Let P now be the property of either being a tautology or being truth-functionally equivalent to A. (1) We first have to show that all sentence letters of L have P. Since there is only one, A, this amounts to showing that A has P. But obviously A has P, since A ⇔ A. (2) We now have to show for every non-atomic X in L, that if X’s immediate predecessors have P, then so does X itself. If X is non-atomic, this means that X must be of the form Y→Z for some sentences Y and Z of L. Y and Z are X’s immediate predecessors, so we now have to assume that both Y and Z have P, and see whether from that assumption we can show that X must have P. So let us assume that Y is equivalent to A or Y is a tautology, and the same for Z. This is our inductive


hypothesis, and it implies that there are four exclusive and exhaustive possibilities to consider: (i) Y is a tautology and Z is a tautology; or (ii) Y is a tautology and Z ⇔ A; or (iii) Y ⇔ A and Z is a tautology; or finally (iv) Y ⇔ A and Z ⇔ A. We shall now consider these in turn. (i) We here have the truth table

Clearly, X=Y→Z is a tautology, and therefore X has P. (ii) Now we have the truth table

Here X=Y→Z is obviously equivalent to A, and so in this case too X has P. (iii) This gives the truth table

in which case X=Y→Z is a tautology. Hence in this case X has P. (iv) Here the truth table is

and so in this case X is a tautology, and so has P. We have just proved that in each of the four different possible cases, if Y and Z have P, so does X. The Principle of Induction on Immediate Predecessors now permits us to conclude that all sentences in L are either equivalent to A or are tautologies, and step (3) is accomplished. Q.E.D.
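
Neither of the two results just proved needs computational support, but it can be reassuring to see them confirmed on the first few levels of each language. The Python sketch below is only an empirical check on finitely many sentences, not a substitute for the inductive proofs; the tuple encoding and the names `build`, `value` and `table` are assumptions of ours. A sentence's truth table is recorded as the pair of its values when A is T and when A is F.

```python
def build(connective, rounds):
    """Sentences of L[A; connective] obtainable in at most `rounds` rounds."""
    current = {"A"}
    for _ in range(rounds):
        current = current | {(connective, y, z)
                             for y in current for z in current}
    return current


def value(s, a):
    """Truth-value of s when the sentence letter A has the value a."""
    if s == "A":
        return a
    connective, y, z = s
    vy, vz = value(y, a), value(z, a)
    return (vy and vz) if connective == "and" else ((not vy) or vz)


def table(s):
    """The truth table of s: its values for A = T and for A = F."""
    return (value(s, True), value(s, False))


TAUTOLOGY = (True, True)
LIKE_A = (True, False)  # the truth table of A itself

arrow = build("imp", 3)
conj = build("and", 3)

# L[A; →]: every sentence is a tautology or is equivalent to A.
print(len(arrow), all(table(s) in (TAUTOLOGY, LIKE_A) for s in arrow))
# L[A; ∧]: no sentence is a tautology.
print(len(conj), all(table(s) != TAUTOLOGY for s in conj))
```

The check covers only the sentences built in three rounds; it is the Principle of Induction on Immediate Predecessors that extends the conclusion to every sentence of each language.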


Exercises 1 What are the immediate predecessors of the following sentences? (a) (A∧B)∨(B∧A) (b) (A∧B)∨¬B (c) A (d) A→(B→C) (e) ¬(A→(B→C)) 2 Show by induction on immediate predecessors that every sentence in L[A; ∧] and every sentence in L[A; ∨] is truth-functionally equivalent to A. 3 Show by induction on immediate predecessors that if X is any sentence in L[A; ↔] then X is either a tautology or is truth-functionally equivalent to A. 4* (Replacement Principle) Suppose X, U, V are sentences in L[A, B, C,…; ∧, ∨, ¬, →]. Define X(V/U) as follows. If U is a sub-sentence of X (i.e. U is a node in X’s ancestral tree), X(V/U) is the result of substituting V for every occurrence of U in X; if U is not a subsentence of X, X(V/U)=X. Show by induction on immediate predecessors that if U⇔V then X⇔X(V/U).

A DIGRESSION ON MATHEMATICAL INDUCTION (This can be skipped by those with an aversion to numbers.) Readers acquainted with the Principle of Mathematical Induction will find something very familiar about that of induction on immediate predecessors. This is as it should be, because at bottom both enunciate one and the same principle. The Principle of Mathematical Induction on the set N = {0, 1, 2, 3,…} of natural numbers says that if P is a property and (1) 0 has P and (2) whenever m in N has P so does m+1, then all natural numbers have P (there is an analogous principle for the positive integers Z+={1, 2, 3,…}, only (1) becomes the condition that 1 has P). Now m is the (sole) immediate predecessor of m+1, and so we can restate clause (2) as…(2'): whenever the immediate predecessor of any non-zero number n has P so does n…, and we have something which differs from the Principle of Induction on Immediate Predecessors only in the fact that truth-functional sentences can have multiple immediate predecessors, whereas each positive integer has only one. To put it another way, the ancestral tree of a sentence may and usually will branch, whereas the ancestral tree of a positive integer is a line. However, there are propositional languages whose structure is formally identical (the technical term is isomorphic) to that of the natural numbers with their successor/predecessor structure. For example, L=L[A; ¬], where A corresponds to 0 and passing from X to ¬X in L corresponds to passing from n to n+1 in N. For those who have not seen a proof by mathematical induction before, here is a well-known and simple one. We want to show that 1+2+3+…+n=n(n+1)/2, for all n in Z+ (the set of positive integers). Let Sn stand for the sum of the first n positive integers, i.e. 1+2+…+n, and f(n) for the function n(n+1)/2. Let n have the property P just in case Sn=f


(n). It is easy to show by mathematical induction that every n in Z+ has P. First we need to check that 1 has P. This is immediate, since clearly S1=1=f(1). Now for the induction step: we shall assume that m has P for some arbitrary m, and then show that m+1 has P. So we suppose that Sm=f(m). Adding m+1 to both sides, we infer that 1+2+…+m+(m+1), i.e. the sum of the first m+1 positive integers, is equal to f(m)+(m+1)=(m(m+1)/2)+(m+1)=(m(m+1)+2(m+1))/2=(m+1)(m+2)/2=f(m+1). Thus from m’s having P we infer that m+1 has P, and the induction step is proved. Hence we infer that for all n in Z+, n has P. Q.E.D.
5 MULTIPLE CONJUNCTIONS AND DISJUNCTIONS In this section we shall use the Principle of Induction on Immediate Predecessors to prove a very useful metaresult about truth-functional compounds which involve only the connective ∧, and compounds which only involve ∨. As a preamble we can verify by truth tables that the following two equivalences hold for any sentences X, Y and Z in any propositional language containing ∧ and ∨ (cf. exercise 1, Chapter 1, section 4):

X∧(Y∧Z) ⇔ (X∧Y)∧Z
X∨(Y∨Z) ⇔ (X∨Y)∨Z
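
The associativity equivalences for ∧ and ∨, namely X∧(Y∧Z) ⇔ (X∧Y)∧Z and X∨(Y∨Z) ⇔ (X∨Y)∨Z, can be verified by running through all eight rows of the truth table. The sketch below is an illustration only; the helper name equivalent is an assumption, and truth functions are modelled as ordinary Python functions of Boolean arguments.

```python
from itertools import product

def equivalent(f, g, n):
    """True if the n-place truth functions f and g agree on every row."""
    return all(f(*row) == g(*row) for row in product([True, False], repeat=n))

def imp(x, y):
    """Truth function of the conditional x→y."""
    return (not x) or y

and_is_associative = equivalent(lambda x, y, z: x and (y and z),
                                lambda x, y, z: (x and y) and z, 3)
or_is_associative = equivalent(lambda x, y, z: x or (y or z),
                               lambda x, y, z: (x or y) or z, 3)
imp_is_associative = equivalent(lambda x, y, z: imp(x, imp(y, z)),
                                lambda x, y, z: imp(imp(x, y), z), 3)

print(and_is_associative, or_is_associative, imp_is_associative)
```

The same check applied to → fails, for instance on the row where all three sentences are false, so bracketing cannot be dropped for the conditional.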

These two conditions amount to saying that ∨ and ∧ are associative: it does not matter where you put the brackets in a conjunction or disjunction of X, Y and Z: X∧(Y∧Z) is true just when all of X, Y and Z are true, and X∨(Y∨Z) is false just when all of X, Y and Z are false. For this reason the sentences above are usually written simply as X∧Y∧Z and X∨Y∨Z respectively. We shall now generalise this result. Let the set S in L[S; ∧] contain n sentence letters, which we shall write A1,…, An (these are metalinguistic symbols used simply to signify that S contains n sentence letters; no ordering of those letters is implied), and let X be a sentence in L[S; ∧], i.e. in L[A1,…, An; ∧]. Thus X is obtained by conjoining some or all of the sentence letters in S in any order, possibly with repetitions. Using the Principle of Induction on Immediate Predecessors we shall now show that for all X in L, X takes the value T in its truth table just when all the Ai in X take the value T. Let P be the property a sentence X in L has when the following condition is satisfied: X takes the value T when and only when all the Ai in X take the value T. We shall show (1) that all the Ai have P, and (2) that for non-atomic X, if X’s immediate predecessors have P then so does X itself. (1) is immediate. For to say that Ai has P is to say that Ai is T when and only when Ai is T, which is itself (trivially) true. Now for the induction step (2). Suppose that X is not an Ai. Then X must be of the form Y∧Z, with immediate predecessors Y and Z for some sentences Y and Z in L. Let us suppose (the inductive hypothesis) that Y and Z each have P, i.e. they are T just when their component Ai are all T. We have to establish that the inductive hypothesis implies that X is T if and only if all its constituent sentence letters are T. (i) Suppose X is T. Then Y and Z must both be T. Hence, by the inductive hypothesis, all the sentence letters in Y and Z are T. 
But these are just the sentence letters in X itself. (ii) Suppose all the sentence letters in X are T. Hence all the sentence


letters in Y and in Z are all T. Hence, by the inductive hypothesis, Y and Z are both T. Hence X is T. So the induction step (2) is established and the result follows. Q.E.D. This result tells us that any two sentences in L containing the same sentence letters are truth-functionally equivalent. A corollary is that we do not need to introduce a new n-place connective to represent the truth function of A1,…, An which is true just when all the Ai are true; any compound of the Ai built up using just the binary connective ∧ has the same truth table, namely T when all the Ai are T, and F in every other case. A further corollary is that if X contains A1,…, An then X can be written without ambiguity as A1∧…∧An: the bracketing inside X makes no difference to its truth-value. This is unlike the situation with →, for example, where A→(B→C) is not equivalent to (A→B)→C. By a similar use of the Induction Principle, we can show that a truth-functional compound X obtained by disjoining the sentences A1,…, An in any order is false just when all of A1,…, An are false, and we can therefore write X without ambiguity as A1∨A2∨…∨An.
Exercises 1 Write out in full the inductive proof that if X is in L[A1,…, An; ∨] then X is false just when all the Ai in X are false. 2 Show that if X1,…, Xn and Y are truth-functional compounds, then Y is a truth-functional consequence of the set {X1,…, Xn} if and only if Y is a truth-functional consequence of the sentence X1∧…∧Xn.
6 THE DISJUNCTIVE NORMAL FORM THEOREM Consider the following list of truth-functional equivalences (if they are not already familiar, commit them to memory now):

Historical note The equivalences X∧Y ⇔ ¬(¬X∨¬Y) and X∨Y ⇔ ¬(¬X∧¬Y) are traditionally called de Morgan’s Laws, after the nineteenth-century mathematician Augustus de Morgan. This list of equivalences shows that in principle we could make do with only negation and one other of the connectives in the set {∧, ∨, ¬, →, ↔}, for each of the remainder is definable in terms of those two (a connective C is definable in terms of others in a set Π just in case for every sentence in L[A, B, C…; C] there is an equivalent one in L[A, B,


C…; Π]). In practice, it is convenient to have more than the bare minimum: for example, it is not immediately obvious that ¬(X→¬Y) is equivalent to X∧Y, and so while we could make do with ¬ and → to formulate conjunctions, it is simpler and clearer to add ∧ to the stock of basic vocabulary. Having too much basic vocabulary can also be inefficient. One would not in practice want to introduce a separate four-place truth-functional operator ç, where ç(A, B, C, D) is defined to be true when A and C are both true, true when A and B are false and D true, and false in all other cases: we simply do not have enough occasion to make this particular type of assertion to warrant giving it a special name. It is nevertheless nice to know that should such occasion actually arise, we should not be lost for words as long as we have ∧, ∨ and ¬. For ç(A, B, C, D) can easily be shown to be truth-functionally equivalent to a sentence in L[A, B, C, D; ∧, ∨, ¬]. Indeed, by a simple procedure we shall describe shortly, any n-place truth-functional operator ∂(A1,…, An) can be shown to be truth-functionally equivalent to some sentence in L[A1,…, An; ∧, ∨, ¬]. We shall first illustrate how the procedure works with ç(A, B, C, D). ç(A, B, C, D) was defined to be true when A and C are true, true when A and B are false and D true, and false in all other cases. So ç(A, B, C, D) is true for the following six rows, and only those rows, of its truth table:

A   B   C   D
T   T   T   T
T   F   T   T
T   T   T   F
T   F   T   F
F   F   T   T
F   F   F   T

The assertion that ç(A, B, C, D) is true is therefore equivalent to the sixfold disjunction: (A is T and B is T and C is T and D is T) or (A is T and B is F and C is T and D is T) or (A is T and B is T and C is T and D is F) or (A is T and B is F and C is T and D is F) or (A is F and B is F and C is T and D is T) or (A is F and B is F and C is F and D is T). We know that for any sentence X, X is false if and only if ¬X is true, i.e. X is F if and only if ¬X is T. So, for example, we can rewrite the second disjunct as ‘A is T and ¬B is T and C is T and D is T’, which is itself equivalent to ‘(A∧¬B∧C∧D) is T.’ We also know that for any sentences X and Y, X is T or Y is T if and only if X∨Y is T. From these facts we infer that ç(A, B, C, D) has the same truth table as the disjunction (A∧B∧C∧D)∨(A∧¬B∧C∧D)∨(A∧B∧C∧¬D)∨(A∧¬B∧C∧¬D)∨ (¬A∧¬B∧C∧D)∨ (¬A∧¬B∧¬C∧D)


and hence is truth-functionally equivalent to it. We can easily generalise this procedure. Suppose X is a sentence in a propositional language whose sentence letters are A, B, C,… For each row of X’s truth table, write out a corresponding conjunction ±A∧±B∧±C∧…, where ±A is defined to be A if A takes the value T at that row, and is ¬A if A takes the value F at that row; similarly for ±B, ±C, etc. (the alphabetical ordering of A, B, C, etc. in the conjunctions is quite arbitrary; any other could be chosen instead). Now form the disjunction of all these conjunctions which correspond to T rows of X’s truth table. This disjunction is a sentence in L[A, B, C,…; ∧, ∨, ¬], which by the reasoning above is truth-functionally equivalent to X. This construction obviously presupposes that X takes the value T on at least one row of its truth table; if X doesn’t, i.e. if X is a contradiction, then X is equivalent to A∧¬A, which is, of course, also a sentence in L[A, B, C,…; ∧, ∨, ¬]. We have in effect proved the following theorem:
Theorem 1 (Disjunctive Normal Form Theorem) Suppose X is a sentence in a propositional language L with n sentence letters, which we shall denote by A1,…, An. If X is not a contradiction, then it is truth-functionally equivalent to a disjunction of conjunctions of the form ±A1∧…∧±An, where +Ai=Ai and −Ai=¬Ai.
Corollary Any truth-functional assertion is equivalent to one which uses just the connectives ∧, ∨ and ¬.
Another way of stating the corollary is that every sentence in the language L in the theorem is truth-functionally equivalent to a sentence in L[A1,…, An; ∧, ∨, ¬]. This is one of the fundamental results of (meta)logic. It tells us that ∧, ∨, ¬ are the most one needs in terms of truth-functional operations to formulate any truth-functional assertion, where by a truth-functional assertion is meant one which can be evaluated as true or false by means of a truth table. 
Nor is this set of connectives the smallest with this ‘universal’ property: we know that either disjunction or conjunction can be defined in terms of each other together with ¬; indeed, as we shall see in the next section, we can actually find a single binary connective with the same ‘universal’ property. A non-contradictory sentence X which is already a disjunction of conjunctions ±A1∧… ∧±An, all with the same number n of conjuncts, is said to be in Disjunctive Normal Form (DNF). When X is a sentence not in DNF, then by the theorem above it is equivalent to one that is. There will be as many of these DNF equivalents of X as there are sentences representable in this form in L. Recall that there will be infinitely many of these in general, allowing for differences in bracketing, the order in which the sentence letters occur, and possible repetitions of sentence letters (see pp. 39–40). We could nevertheless select one of these to be the canonical representative, call it DNF(X), as the disjunctive normal form of any sentence X in L, if we wished. One way might be as follows. Order the sentence letters in X, say alphabetically, and take as DNF(X) that disjunction in L whose disjuncts D1, D2, D3…are of the form ((±A∧±B)∧±C)∧…and then write the


disjuncts similarly as ((D1∨D2)∨D3)∨…. Any other choice of canonical representative would in principle be just as good, however, and we shall simply write DNF(X) as a multiple disjunction D1∨D2∨D3…, without internal pairwise bracketing, of multiple conjunctions ±A∧±B∧±C…, also without internal pairwise bracketing.
Example Let X be the sentence C→¬(B∨A). To find DNF(X), first construct the truth table for X:

X is true at rows 2, 5, 6, 7 and 8. These rows will determine DNF(X). Since the letters in the conjuncts of DNF(X) will appear in the order A, B, C, we have DNF(X)=(A∧B∧¬C) ∨ (¬A∧¬B∧C) ∨ (¬A∧B∧¬C) ∨ (A∧¬B∧¬C) ∨ (¬A∧¬B∧¬C).
Exercises 1 If X contains k sentence letters, what is the maximum number of disjuncts DNF(X) can possess? 2 Find the Disjunctive Normal Form of each sentence below. (i) A→B (ii) A∨B (iii) A∧B (iv) A↔B (v) A¬A∨B (vi) A∧¬(B∨¬(C→¬A)) (vii) ¬(B∨A)
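
The construction behind the Disjunctive Normal Form Theorem is mechanical, and a short Python sketch may make this concrete. It is an illustration under stated assumptions, not the book's own procedure: a sentence is modelled as a Python function of its sentence letters, rows are enumerated in the order produced by itertools.product (which need not match the row order of the table in the text), and the function name dnf is hypothetical.

```python
from itertools import product

def dnf(letters, truth_function):
    """Build a DNF string from the T rows of the truth table.

    letters: the sentence letters, e.g. "ABC", in the chosen fixed order.
    truth_function: a function taking one Boolean per letter.
    """
    disjuncts = []
    for row in product([True, False], repeat=len(letters)):
        if truth_function(*row):
            # ±A is A on a row where A is T, and ¬A on a row where A is F
            literals = [l if v else "¬" + l for l, v in zip(letters, row)]
            disjuncts.append("(" + "∧".join(literals) + ")")
    if not disjuncts:
        # a contradiction has no T rows; use the convention A∧¬A
        return letters[0] + "∧¬" + letters[0]
    return " ∨ ".join(disjuncts)

# The worked example: X is C→¬(B∨A)
example = dnf("ABC", lambda a, b, c: (not c) or not (b or a))
print(example)
```

For the worked example the sketch yields the same five disjuncts as the text, possibly in a different order, since the order depends only on how the rows are listed.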

7 ADEQUATE SETS OF CONNECTIVES A set of connectives in terms of which all truth-functions can be defined is said to be adequate for truth-functional logic, or just adequate, for short. The Disjunctive Normal Form Theorem tells us that the set {∧, ∨, ¬} is adequate. So, of course, is any set which includes these connectives. Also, in view of the fact that ∨ (respectively ∧) is definable, by de Morgan’s Laws, in terms of ¬ and ∧ (respectively ∨), we infer that both sets {¬, ∧}


and {¬, ∨} are adequate. Because X∧Y ⇔ ¬(X→¬Y), and X∨Y ⇔ ¬X→Y, we know that {→, ¬} is also adequate. Surprisingly, we can find a binary connective which by itself turns out to be adequate. In fact, we can find two, symbolised | and ↓, whose truth tables are these:

A   B   A|B   A↓B
T   T   F     F
T   F   T     F
F   T   T     F
F   F   T     T

| is sometimes called alternative denial, because A|B is true just when one at least of A and B is false, and ↓ is sometimes called joint denial, because A↓B is true just when both A and B are false. Other names for | and ↓ are ‘nand’ and ‘nor’ respectively (we shall see why shortly). It might not seem obvious, looking at these truth tables, that we can define negation in terms both of | and ↓. But, as can easily be checked with a truth table, if X is any sentence

¬X ⇔ X|X ⇔ X↓X   (5)

We can also see from the truth table for | that

A|B ⇔ ¬(A∧B)

which is why | is called ‘nand’, nand abbreviating ‘not-and’. Hence

A∧B ⇔ ¬(A|B)

and substituting A|B for X in (5) we get

A∧B ⇔ (A|B)|(A|B)

In other words, we have shown that both ¬ and ∧ are definable in terms of |. Hence, since {¬, ∧} is an adequate set, | must also be adequate. Now for ↓. From its truth table we can see that

A↓B ⇔ ¬(A∨B)

which is why ↓ is often called ‘nor’, i.e. ‘not-or’, or for that matter the ordinary English ‘nor’. Hence

A∨B ⇔ ¬(A↓B)

and so, by (5) again,

A∨B ⇔ (A↓B)↓(A↓B)

So both ¬ and ∨ are definable in terms of ↓; hence, since {¬, ∨} is an adequate set, ↓ is adequate. Exercises 1 Write down the (unsigned) conjugate tree diagrams for | and ↓. 2 Find a sentence X in L[A, B; |] such that A∨B ⇔ X, and a sentence Y in L[A, B; ↓] such that A∧B ⇔ Y. 3 Find sentences X in L[A, B; |] and Y in L[A, B; ↓] such that A→B ⇔ X and A→B ⇔ Y. 4 Which of the following are tautologies, contradictions and neither? (i) (A|B)|(B|A) (ii) (A↓(A↓A))↓(A↓(A↓A)) (iii) (A|(A|A))|(A|(A|A)) (iv) (A|B)↓(A|B) 5 Find the DNFs of (i) (A|B)|(B|A) (ii) A↓(B↓C) 6 Explain why it makes no sense to write A|B|C or A↓B↓C. 7* Show by induction on immediate predecessors that if X is any sentence in L[A, B; ↔, ¬], then X has 0 or 2 or 4 Ts in its truth table (i.e. in the truth table with four rows determined by the four truth-value distributions over A and B). Explain why this shows that {↔, ¬} is not adequate. Hint: think of conjunction and disjunction.
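
The definability claims of this section can be confirmed by brute force. The sketch below assumes the usual truth tables for | (true unless both arguments are true) and ↓ (true only when both arguments are false), and checks that ¬ and ∧ are definable from | alone, and ¬ and ∨ from ↓ alone. The function names nand and nor and the dictionary checks are illustrative.

```python
from itertools import product

def nand(x, y):
    """Alternative denial x|y: false only when both x and y are true."""
    return not (x and y)

def nor(x, y):
    """Joint denial x↓y: true only when both x and y are false."""
    return not (x or y)

BOOLS = (True, False)
PAIRS = list(product(BOOLS, repeat=2))

checks = {
    "¬X ⇔ X|X": all(nand(x, x) == (not x) for x in BOOLS),
    "¬X ⇔ X↓X": all(nor(x, x) == (not x) for x in BOOLS),
    "A∧B ⇔ (A|B)|(A|B)": all(nand(nand(a, b), nand(a, b)) == (a and b) for a, b in PAIRS),
    "A∨B ⇔ (A↓B)↓(A↓B)": all(nor(nor(a, b), nor(a, b)) == (a or b) for a, b in PAIRS),
}

for name, holds in checks.items():
    print(name, holds)
```

Each equivalence holds on every row, which is the truth-table content of the argument that | and ↓ are each adequate on their own.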

8* THE DUALITY PRINCIPLE An interesting and elegant result that can be easily proved using the Principle of Induction on Immediate Predecessors is the Duality Principle:
Theorem 2 (Duality Principle) Let X be any sentence in L[A1,…, An; ∧, ∨, ¬]. Let X* be obtained from X by replacing every occurrence of ∧ in X by ∨, every occurrence of ∨ by ∧, and every occurrence of each Ai by ¬Ai. Then X⇔¬X*. (X* is called the dual of X.)
Proof A sentence X of L, where L is as in the theorem, will be said to have the property P if


X⇔¬X*. We shall prove by induction on immediate predecessors that all sentences of L have P. So we have to establish that the following two conditions are satisfied: (1) each Ai has P; and (2) for any non-atomic X, from the inductive hypothesis that the immediate predecessors of X have P, it follows that X does also. (1) Each Ai clearly has no occurrence of ∨ or ∧, and so Ai* is just ¬Ai. So showing that Ai has P merely requires showing that Ai⇔¬¬Ai, which we know to be the case. (2) The induction step is an argument by cases. If X is not an Ai then X must have one of the following three forms: (i) X=Y∨Z, (ii) X=Y∧Z, or (iii) X=¬Y where Y and Z are sentences of L. If X is of the form (i) or (ii) it has as immediate predecessors Y and Z, while if it is of the form (iii) it has the one immediate predecessor Y. We shall check that the induction step holds in each of the cases. (i) Suppose that Y and Z each have P, i.e. that Y⇔¬Y* and Z⇔¬Z*. This supposition, recall, is the inductive hypothesis. From this we infer that Y∨Z ⇔ ¬Y*∨¬Z* (exercise 1 below). By de Morgan’s Laws ¬Y*∨¬Z* ⇔ ¬(Y*∧Z*). But Y*∧Z*=(Y∨Z)* (exercise 2 below), and Y∨Z=X. So we have shown that the inductive hypothesis implies that X⇔¬X*, i.e. X has P as required. (ii) We have the same inductive hypothesis as in (i). So again Y⇔¬Y* and Z⇔¬Z*. Hence Y∧Z ⇔ ¬Y*∧¬Z*. By de Morgan again, ¬Y*∧¬Z* ⇔ ¬(Y*∨Z*). But Y*∨Z*=(Y∧Z)*=X*. So X⇔¬X* in this case too. (iii) Here the inductive hypothesis is simply that Y⇔¬Y*. Hence ¬Y⇔¬¬Y*. But ¬Y*=(¬Y)*=X*. Hence X⇔¬X*. Q.E.D.
Exercises 1 Show that if X⇔Y and V⇔W then (a) X∧V⇔Y∧W (b) X∨V⇔Y∨W 2 For any sentences Y, Z in the language L[A1,…, An; ∧, ∨, ¬], explain why (a) ¬Y*=(¬Y)* (b) Y*∧Z*=(Y∨Z)* (c) Y*∨Z*=(Y∧Z)* (Imagine you are a computer programmed with the instructions for converting a sentence X in L into its dual. The program performs its function by working from left to right through X, examining every symbol, changing it appropriately or leaving it unaltered.) 
3 Show that if X⇔Y then X*⇔Y*
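
The Duality Principle lends itself to a direct check on particular sentences. The following Python sketch is illustrative only and rests on an assumed encoding: a sentence letter is a string, and compounds are tuples ("not", X), ("and", X, Y) or ("or", X, Y). The functions dual, value, letters and duality_holds are hypothetical names, and only the connectives ∧, ∨ and ¬ are handled.

```python
from itertools import product

def dual(s):
    """X*: swap ∧ and ∨ and replace each sentence letter Ai by ¬Ai."""
    if isinstance(s, str):
        return ("not", s)
    if s[0] == "not":
        return ("not", dual(s[1]))
    swapped = "or" if s[0] == "and" else "and"
    return (swapped, dual(s[1]), dual(s[2]))

def value(s, v):
    """Truth-value of s under the distribution v (a dict from letters to Booleans)."""
    if isinstance(s, str):
        return v[s]
    if s[0] == "not":
        return not value(s[1], v)
    left, right = value(s[1], v), value(s[2], v)
    return (left and right) if s[0] == "and" else (left or right)

def letters(s):
    """The set of sentence letters occurring in s."""
    if isinstance(s, str):
        return {s}
    return set().union(*(letters(part) for part in s[1:]))

def duality_holds(s):
    """Check X ⇔ ¬X* on every row of the truth table of s."""
    names = sorted(letters(s))
    rows = (dict(zip(names, row)) for row in product([True, False], repeat=len(names)))
    return all(value(s, v) == (not value(dual(s), v)) for v in rows)

sample = ("or", ("and", "A", ("not", "B")), "C")   # (A∧¬B)∨C
print(dual(sample))
print(duality_holds(sample))
```

A check of this kind confirms individual instances only; the general result still requires the inductive proof given above.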

9* CONJUNCTIVE NORMAL FORMS An important application of the Duality Principle is in finding what are called the Conjunctive Normal Forms of truth-functional compounds. Define a literal to be a sentence letter or the negation of one. DNF(X), where X is any non-contradictory


sentence, is a disjunction of conjunctions of literals. Now it is easy to show from the Disjunctive Normal Form Theorem and the Duality Principle that X is also equivalent to a conjunction of disjunctions of literals. Since for any sentence X in any language L, X ⇔ DNF(X), we must have ¬X ⇔ DNF(¬X). DNF(¬X) is of course a sentence whose connectives are only ∧, ∨ and ¬, and so by Duality, DNF(¬X) ⇔ ¬(DNF(¬X))*, where as before (DNF(¬X))* is the dual of DNF(¬X). Hence we have ¬X ⇔ ¬(DNF(¬X))*, and so X ⇔ (DNF(¬X))*. If we now eliminate any double negation ¬¬ that might appear in (DNF(¬X))* we obtain a conjunction of disjunctions of sentence letters and their negations, and this conjunction is called the Conjunctive Normal Form CNF(X) of X. For example, suppose X is the compound ¬(A∧(B→C)). As a first step, write out DNF(¬X). This is (A∧B∧C) ∨ (A∧¬B∧¬C) ∨ (A∧¬B∧C) (as in section 6, we are assuming that the ordering of sentence letters relative to which the DNFs are defined is alphabetical). Dualising, and eliminating double negations, we get CNF(X)=(¬A∨¬B∨¬C) ∧ (¬A∨B∨C) ∧ (¬A∨B∨¬C). To sum up: to find the CNF of a compound X you (i) find DNF(¬X), (ii) obtain its dual (DNF(¬X))*, and (iii) eliminate all double negations from (DNF(¬X))*; the result is CNF(X). If X is a tautology then its negation is a contradiction, and the DNF of ¬X is, according to the convention agreed earlier, A∧¬A. By steps (ii) and (iii) above, CNF(X)=¬A∨A.
Another example Let X be (C→¬(B∨A))∧¬(C∧A). DNF(¬X)=(A∧B∧C) ∨ (¬A∧B∧C) ∨ (A∧¬B∧C). Hence CNF(X)=(¬A∨¬B∨¬C) ∧ (A∨¬B∨¬C) ∧ (¬A∨B∨¬C). Conjunctive Normal Forms are very important in logic programming, where the inferential technique of resolution works on sentences cast into this form.
Exercises Write out CNF(X) for each sentence X below: 1 A→B 2 A∧B 3 A→(B∨C) 4 A→(B→A)
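
The three-step recipe for CNF(X) can be sketched in Python as follows. This is an illustration under assumptions rather than a definitive implementation: a sentence is modelled as a Python function of its letters, the T rows of ¬X are simply the F rows of X, and dualising a disjunct of DNF(¬X) and then removing double negations amounts to flipping each literal and joining with ∨. The function name cnf is hypothetical, and the order of the conjuncts depends on the order in which rows are enumerated.

```python
from itertools import product

def cnf(letters, truth_function):
    """Build a CNF string for X from the F rows of X's truth table."""
    clauses = []
    for row in product([True, False], repeat=len(letters)):
        if not truth_function(*row):          # a T row of ¬X, i.e. a disjunct of DNF(¬X)
            # dual of ±A1∧…∧±An with double negations removed: flip every literal
            literals = ["¬" + l if v else l for l, v in zip(letters, row)]
            clauses.append("(" + "∨".join(literals) + ")")
    if not clauses:
        # X is a tautology: DNF(¬X) is A∧¬A by convention, whose dual reduces to ¬A∨A
        return "¬" + letters[0] + "∨" + letters[0]
    return " ∧ ".join(clauses)

first = cnf("ABC", lambda a, b, c: not (a and ((not b) or c)))           # ¬(A∧(B→C))
second = cnf("ABC", lambda a, b, c: ((not c) or not (b or a)) and not (c and a))
print(first)
print(second)
```

Up to the order of the conjuncts, the two outputs agree with the two worked examples in this section.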

Chapter 4 Soundness and completeness
1 THE STANDARD PROPOSITIONAL LANGUAGE We have already had some experience, in Chapter 2, of constructing truth trees. In this chapter we shall establish the procedure on a systematic basis and give rigorous proofs of their fundamental properties, namely that if the members of a finite set of truth-functional sentences are jointly satisfiable, i.e. all true for some one distribution of truth-values over their sentence letters, then they generate a finite open tree, while if they are not satisfiable they generate a closed one. These results establish a relationship between a syntactic property of a finite set Σ of sentences, its generating a closed or open tree (this is a syntactic property because, as we shall see, rules can be given for tree construction of a purely formal character), and a semantic one, satisfiability. To prove them we have to state more precisely what form the rules of tree construction take. To begin with, we need to confine the discussion to a specific propositional language. Because our aim is a model of deductive reasoning of as great a generality as can be achieved, this propositional language should be universal, in the sense that any truth-functional inference can be represented in it. We know from the Disjunctive Normal Form Theorem that the truth-functional structure of all the sentences appearing as premises and conclusion in any inference can be translated into sentences of a propositional language L[S; Π] in which Π is the set {∧, ∨, ¬}. So any propositional language with those connectives and in which there is an unlimited supply of sentence letters will be universal in the sense we require. To give ourselves something a bit more like the luxury of ordinary language (which is what, after all, we are modelling), we shall add → to the set {∧, ∨, ¬}. 
What we shall now do is select an arbitrary one of these universal languages, which differ only in their sentence letters, and call it the standard propositional language; in the rest of this chapter it will be referred to as L[A, B, C,…; ∧, ∨, ¬, →] or just L. ↔ is not one of L’s connectives; for technical reasons which will become apparent it is convenient to do without it. However, it is of course there implicitly, and we shall continue to write X↔Y in appropriate circumstances, regarding this as merely shorthand for one of its truth-functional equivalents in L, like (X→Y)∧(Y→X), for example.

2 TRUTH TREES AGAIN We can ‘collapse’ all the conjugate diagrams for ∧, ∨, → into just two in the following way. First, we rearrange them into two groups as follows:


(i)

X∧Y      ¬(X∨Y)     ¬(X→Y)
 |         |          |
 X        ¬X          X
 Y        ¬Y         ¬Y

(ii)

  X∨Y         X→Y        ¬(X∧Y)
  / \         / \         / \
 X   Y      ¬X   Y      ¬X   ¬Y

We shall classify the upper sentences in (i) and (ii) as α and β sentences as follows (with a minor variation this is Smullyan’s (1968) classification): An α sentence of L is of the form either X∧Y, ¬(X∨Y) or ¬(X→Y), where X and Y are sentences of L. For each of these three types of α sentence we define a corresponding pair of sentences α1, α2:

α          α1     α2
X∧Y        X      Y
¬(X∨Y)     ¬X     ¬Y
¬(X→Y)     X      ¬Y

A β sentence of L is of the form X∨Y, X→Y, or ¬(X∧Y), where X and Y are sentences of L. For each of these three types of β sentence we define corresponding pairs of sentences β1, β2:

β          β1     β2
X∨Y        X      Y
X→Y        ¬X     Y
¬(X∧Y)     ¬X     ¬Y

The two groups of diagrams above can now be represented by just the two diagrams:

 α            β
 |           / \
 α1        β1   β2
 α2

These diagrams will henceforth be known as the tree rules (α) and (β) respectively; the pairs of sentences α1, α2 and β1, β2 will be called the descendants of α and β respectively under the rules. Together with the Rule of Double Negation

¬¬X
 |
 X


these will be all the rules we shall employ in constructing truth-functional truth trees. A simple result which we shall find useful and which can be left to the reader to check is the following:
Theorem 1 If α is any α sentence and β is any β sentence then α⇔α1∧α2 and β⇔β1∨β2.
Theorem 1 explains why α sentences are sometimes called conjunctive sentences, and β sentences disjunctive sentences. Another useful result we shall need is this: every sentence in L is either a literal, or a sentence commencing with at least one pair of consecutive occurrences of a negation symbol, or an α sentence, or a β sentence. It’s hardly necessary to dignify this with the title ‘theorem’, and we shall leave its demonstration as an exercise (you just need to note that every sentence in L is either a sentence letter or else is a sentence of the form ¬X, X∧Y, X∨Y or X→Y, with outer brackets omitted as usual). Before proceeding to a formal statement of the rules of tree construction, we shall briefly review some general tree-concepts. The nodes (see Chapter 3, section 3) are the end-points and junction-points of line-segments. The topmost node is the root of the tree. The nodes on an ancestral tree are all single sentences; those on a truth tree, by contrast, may also be constituted by sets of sentences: the pair {α1, α2}, appearing on a tree without the set brackets {} and with α1 conventionally written above α2, is a single node, as is the set of initial sentences which will, unless otherwise indicated, form the root of the tree. A branch in a tree is the sequence of nodes in any continuous path along line-segments upwards from the lowest node to the root, including the end-points. We can regard a single node as a degenerate branch. Of course, we have not yet formally defined a tree. So far we have relied on the following informal characterisation. 
A finished tree generated by a finite set Σ of initial sentences is the entity generated by applying a tree rule to an unused sentence in Σ (if there is one; if there isn’t, Σ itself is the finished tree generated by Σ), and continuing to apply the appropriate rule to every unused sentence on every branch generated, closing a branch as soon as a sentence and its negation appear on it. When either every branch has closed or there are no further sentences that can be used, the tree is finished. A defect of this definition, apart from its informality, is that it fails to highlight one of the most important features of truth trees, namely the fact that they can be constructed in a completely mechanical way. Another way of putting this is to say that their construction can be made entirely algorithmic, or programmable on a suitably idealised computer. Here is a ‘proto-program’ for constructing a tree from Σ: Start. Write the members of Σ in a column. Check that no sentence and its negation are in Σ. If there is a sentence and its negation in Σ, draw a line under Σ, and stop: the tree consisting of just Σ is said to be finished and closed. If there is no sentence and its negation in Σ, check that there is at least one non-literal in Σ (as in Chapter 2, section 9*, a literal is any sentence letter or negation of a sentence letter). If there is not, stop: the tree consisting just of Σ is finished and open. If neither of these eventualities is the case, select the topmost non-literal X in Σ. We know that X is either a sentence commencing with more than one negation symbol, or else is an α or β sentence. We deal with these


cases in turn. (i) If X is a sentence commencing with more than one ¬, we can write it ¬¬Y; now apply the double negation rule to X, i.e. write

Y

directly below Σ. X has now been used. (ii) If X is an α sentence, apply the (α) rule to X; i.e. write

α1
α2

directly below Σ, where α1, α2 are the descendants of X under that rule. X has now been used. (iii) If X is a β sentence, apply the (β) rule to X; i.e. write

  / \
β1   β2

below Σ, where β1, β2 are the corresponding descendants of X under that rule. X has now been used. Now repeat the following sequence of instructions, enclosed within < and > brackets, in the order they appear, until no further repetition of the sequence is possible. When that happens, stop. The result is a finished tree generated from Σ. We called this set of instructions a proto-program because while it is not a computer program, it can be turned into one. When started up it will always halt after finitely many repeats, whatever the set of initial sentences so long as that set is finite. This is because each of the rules (α) and (β) applied to a sentence eliminates a binary connective, while the Rule of Double Negation ensures that at some stage in the evolution of the tree every sentence on it commencing with a consecutive sequence of ¬’s will be reduced either to a literal, or else to an α or β sentence. As there are only finitely many occurrences of a connective in any sentence, it follows that after finitely many stages of the program all the sentences in Σ will have been decomposed into literals, or else the tree will have closed at some earlier point. As we pointed out in Chapter 2, neither the numbering of lines on the tree nor the


ticking of sentences on it as they are used is integral to the tree: they merely help the tree-grower keep track of what they are doing. Henceforward we shall number lines where it is obviously helpful, though instead of ticking sentences as they are used we shall state at each line, where it is not obvious, which line number was used to obtain that line.
Exercises 1 For each of the following sentences, say whether it is an α sentence, a β sentence, a literal, or none of these. (i) A→(B→C) (ii) ¬¬A (iii) (A∧B)∨¬(A∧¬C) (iv) ¬B (v) ¬(B→(¬C∨¬¬D)) (vi) ((C∨¬D)∨¬(D∧B))∧¬(A∨¬A) 2 Construct finished trees for the following sets of sentences, and say whether the trees are open or closed. (i) {A→B, B→C, ¬C, A} (ii) {A→B, B→C, ¬C, ¬A} (iii) {A∨B, ¬B∨C, ¬A, ¬C} (iv) {A∨B, ¬B∨C, ¬A, C} 3 Show that every sentence in L is either a literal, or a sentence whose initial symbols are a block of more than one negation symbol, or else is an α or β sentence.
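
The tree-building procedure of this section can be turned into a short program. The Python sketch below is one possible rendering under stated assumptions, not the book's own program: sentence letters are strings, compounds are tuples ("not", X), ("and", X, Y), ("or", X, Y) and ("imp", X, Y), a branch is a list of sentences, and double negation is treated as a non-branching rule with a single descendant. All function names are hypothetical.

```python
def neg(s):
    return ("not", s)

def is_literal(s):
    # a literal is a sentence letter (a str) or the negation of one
    return isinstance(s, str) or (s[0] == "not" and isinstance(s[1], str))

def rule(s):
    """Return (branching, descendants) for a non-literal sentence s."""
    if s[0] == "and":
        return False, [s[1], s[2]]                   # α: X∧Y
    if s[0] == "or":
        return True, [s[1], s[2]]                    # β: X∨Y
    if s[0] == "imp":
        return True, [neg(s[1]), s[2]]               # β: X→Y
    inner = s[1]                                     # s is ¬(compound)
    if inner[0] == "not":
        return False, [inner[1]]                     # double negation
    if inner[0] == "or":
        return False, [neg(inner[1]), neg(inner[2])]     # α: ¬(X∨Y)
    if inner[0] == "imp":
        return False, [inner[1], neg(inner[2])]          # α: ¬(X→Y)
    return True, [neg(inner[1]), neg(inner[2])]          # β: ¬(X∧Y)

def is_closed(branch):
    # a branch closes when some sentence and its negation both lie on it
    return any(neg(s) in branch for s in branch)

def open_branches(branch, unused=None):
    """Return the open branches of a finished tree grown from the given sentences."""
    branch = list(branch)
    if unused is None:
        unused = [s for s in branch if not is_literal(s)]
    if is_closed(branch):
        return []
    if not unused:
        return [branch]
    first, rest = unused[0], unused[1:]              # topmost unused non-literal
    branching, parts = rule(first)
    extensions = [[p] for p in parts] if branching else [parts]
    found = []
    for ext in extensions:
        still_unused = rest + [p for p in ext if not is_literal(p)]
        found += open_branches(branch + ext, still_unused)
    return found

closed_example = open_branches([("imp", "A", "B"), "A", neg("B")])   # {A→B, A, ¬B}
open_example = open_branches([("or", "A", "B"), neg("A")])           # {A∨B, ¬A}
print(closed_example)
print(open_example)
```

An empty result means the finished tree is closed; otherwise each returned list is an open branch. The recursion terminates for the reason given in the text: every rule application replaces a sentence by strictly shorter ones.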

3 TRUTH-FUNCTIONAL CONSISTENCY, TRUTH-FUNCTIONALLY VALID INFERENCES, AND TREES Suppose we have generated a finished tree from some finite set Σ of initial sentences. There are two possibilities: (i) the tree is closed, and (ii) it is open. We have already been told how to interpret open and closed trees in terms of the satisfiability or unsatisfiability, respectively, of their sets of initial sentences. This interpretation is justified by a pair of metatheorems which will be proved shortly, the Soundness and Completeness Theorems for truth-functional trees, whose names we are already familiar with but whose precise statement will now be useful:
Soundness Theorem If τ is a truth-value distribution over sentence letters which makes all the members of Σ true, then every finished tree generated by Σ is open, and has at least one open branch on which all the sentences are made true by τ.


Completeness Theorem If Σ generates a finished open tree, then there is a distribution of truth-values to sentence letters which makes all the members of Σ true; indeed, any truth-value assignment which makes all the literals on any one open branch true is one.
Hence if (i) is the case, the Soundness Theorem tells us that there is no truth-value distribution over the sentence letters appearing in Σ which satisfies Σ, i.e. which makes all the sentences in Σ true (this is simply the contrapositive form of the Soundness Theorem). If (ii) is the case, the Completeness Theorem tells us that assigning the value T to each literal on any open branch, and any values whatever to the remaining sentence letters in Σ, gives a truth-value distribution over the sentence letters in Σ which satisfies Σ. We shall say that a set of sentences in any propositional language is truth-functionally consistent if there is a distribution of truth-values over its sentence letters which makes all its sentences true (this is the property referred to earlier as satisfiability). If there is not, then it is truth-functionally inconsistent. The Soundness Theorem and Completeness Theorem tell us that a finite set of sentences in the standard propositional language is truth-functionally consistent if and only if it generates a finished open tree. We now list some further consequences of these theorems. (a) Whether a tree generated from a finite set Σ closes or not does not depend on the order in which you use the unused sentences which appear at any stage in the development of the tree; you will usually get different trees, but either all those trees close or all remain open, depending only on whether Σ is truth-functionally consistent or not. If Σ is truth-functionally consistent, then from Soundness it follows that any finished tree generated from Σ has an open branch. 
The Completeness Theorem implies that if any finished tree generated from Σ has an open branch, then Σ is truth-functionally consistent; hence if Σ is truth-functionally inconsistent any tree generated from Σ closes. (b) If an inference is truth-functionally valid then the set Σ consisting of its premises and the negation of its conclusion is truth-functionally inconsistent (to show this is left as exercise 1 below), and the Completeness Theorem implies that any tree generated by Σ will close. We shall say that there is a tree proof of the conclusion of an inference from its premises if a closed tree can be generated from its premises and the negation of its conclusion. Thus the Completeness Theorem implies that if an inference is truth-functionally valid then there is a tree proof of its conclusion from its premises (this is the sense of ‘completeness’ to which the name of the theorem refers). (c) If an inference is truth-functionally invalid then the set Σ consisting of its premises and the negation of its conclusion is truth-functionally consistent. Hence by the Soundness Theorem any finished tree generated by Σ will remain open, and by the Completeness Theorem every truth-value distribution over sentence letters obtained by assigning the value T to all the literals on any one open branch is a counterexample to the inference. (d) If the set Σ consisting of the premises and the negation of the conclusion of an inference generates a finished open tree, then every counterexample to the inference can be obtained from some open branch. For suppose τ is a distribution of truth-values to the sentence letters of the inference which makes all the sentences in Σ true. Then there is an open branch in any finished tree generated by Σ on which all the sentences, and hence all the literals, are true under τ.
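The way these consequences work in practice can be sketched in a few lines of Python. What follows is only an illustration under stated assumptions, not a transcription of the tree rules as set out earlier: sentences are encoded as nested tuples, the names `open_branches`, `has_tree_proof` and `distributions` are invented for the purpose, and a branch is treated as closed only when a sentence letter and its negation both appear on it (a simplification which should not affect whether a finished tree closes).

```python
from itertools import product

# A sentence letter is a string; compound sentences are tuples:
# ('not', X), ('and', X, Y), ('or', X, Y), ('imp', X, Y).

def neg(x):
    return ('not', x)

def is_literal(s):
    return isinstance(s, str) or (s[0] == 'not' and isinstance(s[1], str))

def complement(lit):
    return lit[1] if isinstance(lit, tuple) else neg(lit)

def expand(s):
    """Apply the tree rule for a non-literal s; each inner list is one branch."""
    op = s[0]
    if op == 'and':
        return [[s[1], s[2]]]                   # alpha
    if op == 'or':
        return [[s[1]], [s[2]]]                 # beta
    if op == 'imp':
        return [[neg(s[1])], [s[2]]]            # beta
    t = s[1]                                    # s is ('not', t) with t compound
    if t[0] == 'not':
        return [[t[1]]]                         # double negation
    if t[0] == 'and':
        return [[neg(t[1])], [neg(t[2])]]       # beta
    if t[0] == 'or':
        return [[neg(t[1]), neg(t[2])]]         # alpha
    return [[t[1], neg(t[2])]]                  # alpha (negated conditional)

def open_branches(unused, literals=frozenset()):
    """Return the literal sets of the open branches of a finished tree."""
    if not unused:
        return [literals]
    first, rest = unused[0], list(unused[1:])
    if is_literal(first):
        if complement(first) in literals:
            return []                           # the branch closes
        return open_branches(rest, literals | {first})
    branches = []
    for extension in expand(first):
        branches += open_branches(extension + rest, literals)
    return branches

def has_tree_proof(premises, conclusion):
    """A tree proof exists if premises plus negated conclusion give a closed tree."""
    return open_branches(list(premises) + [neg(conclusion)]) == []

def distributions(branch, letters):
    """Every distribution over `letters` making all literals on `branch` true."""
    fixed = {(l if isinstance(l, str) else l[1]): isinstance(l, str) for l in branch}
    free = sorted(set(letters) - set(fixed))
    return [{**fixed, **dict(zip(free, values))}
            for values in product([True, False], repeat=len(free))]

# Modus ponens (A→B, A ∴ B) has a tree proof; the inference A→D ∴ A does not.
print(has_tree_proof([('imp', 'A', 'B'), 'A'], 'B'))
sigma = [('imp', 'A', 'D'), neg('A')]
for branch in open_branches(sigma):
    print(sorted(map(str, branch)), distributions(branch, 'AD'))
```

In the second example one open branch carries only the literal ¬A, so it leaves the value of D undetermined and yields more than one counterexample, in line with consequence (c).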

Now for some examples. We shall start by showing that the inference in exercise 3(ii), section 8, Chapter 1, is truth-functionally valid. The inference, formalised

has a tree proof as follows (opposite):

Now look at the inference of exercise 3(i), section 8, Chapter 1. Formalised in the standard propositional language, it becomes

The finished open tree below shows that it is truth-functionally invalid. The justification for each line has been omitted; as an exercise the reader should supply it.

The tree is finished, with one open branch. The Completeness Theorem tells us that if the value T is assigned to each of the literals on this branch, the initial sentences are themselves assigned the value T. In this way, the branch determines a counterexample to the inference. In fact, it determines two, since it specifies the truth-values only of A, B, C and E, namely A–F, B–F, C–T, E–F, but it does not determine that of D. Hence the open branch determines the two distributions A–F, B–F, C–T, D–T, E–F and A–F, B–F, C–T, D–F, E–F, both of which are counterexamples to the inference. Below are some simple truth-functionally valid inferences, known for so long that they have acquired classic status (and names, in the case of the first two). We have already made the acquaintance of some of them. Though not essential, it is useful to remember them. Modus ponens: Modus tollens: Hypothetical syllogism: Disjunctive syllogism: Importation: Contraposition:
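The schemas for the six named inferences are not displayed in the text as it stands here. As a hedged sketch, the following Python check assumes their usual textbook forms (modus ponens: A→B, A ∴ B; modus tollens: A→B, ¬B ∴ ¬A; hypothetical syllogism: A→B, B→C ∴ A→C; disjunctive syllogism: A∨B, ¬A ∴ B; importation: A→(B→C) ∴ (A∧B)→C; contraposition: A→B ∴ ¬B→¬A) and confirms by truth table that each is truth-functionally valid. The helper names `imp` and `valid` and the choice of sentence letters are illustrative only.

```python
from itertools import product

def imp(a, b):
    """Truth table of the conditional."""
    return (not a) or b

def valid(premises, conclusion, n):
    """True if no row of the truth table makes the premises true and the conclusion false."""
    return all(conclusion(*row)
               for row in product([True, False], repeat=n)
               if all(p(*row) for p in premises))

# (premises, conclusion, number of sentence letters); the schemas are assumed forms.
CLASSIC = {
    'modus ponens': ([lambda a, b: imp(a, b), lambda a, b: a],
                     lambda a, b: b, 2),
    'modus tollens': ([lambda a, b: imp(a, b), lambda a, b: not b],
                      lambda a, b: not a, 2),
    'hypothetical syllogism': ([lambda a, b, c: imp(a, b), lambda a, b, c: imp(b, c)],
                               lambda a, b, c: imp(a, c), 3),
    'disjunctive syllogism': ([lambda a, b: a or b, lambda a, b: not a],
                              lambda a, b: b, 2),
    'importation': ([lambda a, b, c: imp(a, imp(b, c))],
                    lambda a, b, c: imp(a and b, c), 3),
    'contraposition': ([lambda a, b: imp(a, b)],
                       lambda a, b: imp(not b, not a), 2),
}

for name, (premises, conclusion, n) in CLASSIC.items():
    print(name, valid(premises, conclusion, n))

# By contrast, 'affirming the consequent' (A→B, B ∴ A) is not valid.
print(valid([lambda a, b: imp(a, b), lambda a, b: b], lambda a, b: a, 2))
```

A truth-table check of this kind and a tree proof should always agree, since by the two theorems a closed tree exists exactly when there is no counterexample.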

Exercises 1 Construct tree proofs for hypothetical syllogism and importation. 2 Give tree proofs of the truth-functional validity of the following (i) A, ¬A ∴B (ii) B ∴A∨¬A 3 State which of the following inferences are truth-functionally valid, and justify your statements by constructing appropriate trees, using the sentence letters indicated. For

any which is not valid, list the counterexamples to it. (a) Tom will go to the show (A) only if Amanda will go (B), but Amanda won’t go unless Henry will go (C). If Henry won’t go, therefore, neither will Tom. (b) A person is entitled to benefit (A) only if either they are unemployed (B), or they are over 60 (C) and they have a disposable income of less than £10,000 per year (D). Therefore, if they have an income of less than £10,000 per year and are over 60 they are entitled to benefit. (c) If the Ukraine secedes from the treaty (A), and allies itself with Poland (B), then Georgia will ally itself with Russia (C). Georgia won’t ally itself with the Baltic republics (¬D) if the latter support economic decentralisation (E), and if Georgia allies itself with Russia, then the Baltic republics will support economic decentralisation or ask for help elsewhere (F). Therefore if the Ukraine secedes from the treaty and Georgia allies itself with the Baltic republics, then either the Baltic republics will ask for help elsewhere or the Ukraine won’t ally itself with Poland. 4 From the open trees generated by each of the following sentences, identify the distributions of truth-values over the sentence letters which make the sentences true. (a) B→¬(A∧¬C) (b) ¬(D∧(C→(D∨C)))

4* SOUNDNESS AND COMPLETENESS THEOREMS We shall now prove the Soundness and Completeness Theorems for truth-functional trees. The Soundness Theorem requires the following lemma (a lemma is merely a preliminary result): Lemma 1 Suppose that B is a branch on a tree. If all the sentences on B are true under some distribution of truth-values to their sentence letters, then B is open. Proof If B were closed it would contain a sentence and its negation; and both cannot be true together. Theorem 2 (Soundness Theorem for Propositional Truth Trees) Let Σ be a finite set of sentences in L. If Σ is satisfiable by a truth-value distribution τ over its sentence letters, then any finished tree generated by the rules of tree construction has an open branch, all the sentences on which take the value T under τ.

Proof Suppose that a finished tree has been constructed from Σ and that there is some distribution τ of truth-values to the sentence letters in Σ which makes all the sentences in Σ true. We shall find an open branch in the tree. Let B be the branch-segment consisting just of Σ (B is therefore just a single node). By Lemma 1 B is open. There are two possibilities: Σ contains at least one non-literal, or it does not. If it does not, B is a finished open branch all of whose sentences take the value T. If B does contain a non-literal, then B has an extension B' in the tree generated by the application of one of the rules (α) or (β) or Double Negation to some non-literal on B, such that all the sentences on B' are T under τ. For suppose rule (α) was applied. The sentences on B' are those in Σ and α1 and α2. But by assumption all the sentences on B take the value T under τ. Hence by Theorem 1 both α1, and α2 will be true under τ, and so all the sentences on B' will take the value T. If rule (β) was applied, Theorem 1 tells us that one of β1 and β2 must also be T under τ. Suppose, without loss of generality, that it is β1. In this case let B' be that extension of B whose nodes are Σ and β1. Again, all the sentences on B' take the value T. Finally, if the Rule of Double Negation was applied to a sentence ¬¬X, let B' be the extension of B which includes X. Since by assumption ¬¬X takes the value T so does X, and so all the sentences on B' take the value T under τ. In each case, since B' contains only sentences true under τ, B' is open, by Lemma 1. If B' is not finished then it has an extension in the tree obtained by applying one of the tree rules to a sentence on B'. Proceeding as before, we construct another open branch B" which extends B'. Continuing in the same way we obtain a sequence of open branches B, B', B",…in the tree, each of which extends its predecessors. 
The sequence terminates at some finite stage, and when it does so we have a finished open branch in the tree, as required. Q.E.D. Note This proof is analogous to a proof by mathematical induction on the positive integers (Chapter 3, section 4). It can indeed be converted into one explicitly, though to do so hardly adds to its force. The proof of the Completeness Theorem requires another four easy lemmas. Lemma 2 Suppose B is an open branch in a finished tree and let X be any sentence on B. Then, if X is an α sentence, both α1 and α2 will appear on B below X. If X is a β sentence then either β1 or β2 will appear on B below X. If X is of the form ¬¬Y then Y will appear on B below X. Lemma 2 is an immediate consequence of the rules of tree construction. For the next three lemmas we require a couple of definitions. For each connective in the standard propositional language we define its weight as follows: the weight of ¬ is 1, and that of each of ∧, ∨ and → is 2. The degree of a sentence X in the standard propositional language is now defined to be the sum of the weights of the connectives occurring in X, counting all repetitions as separate occurrences. So, for example, the degree of A is 0, of ¬¬¬A is 3, of B∨¬B is 3, of C→(¬C→(A∨¬¬B)) is 9, of ¬(¬A∨B) is 4, etc. The degree of X is a property of the formal structure of X itself; in particular, the fact

that ¬¬A is truth-functionally equivalent to A does not mean that the degree of ¬¬A is 0; it is of course 2. Lemma 3 For any α sentence X in L, the degrees of α1 and α2 are each less than the degree of X, and if X is a β sentence, the degrees of β1 and β2 are each less than the degree of X. The proof of Lemma 3 consists simply in checking that the α and β rules always eliminate a binary connective. The next lemma introduces us to another type of inductive argument, sometimes called strong induction. Lemma 4 Let ∆ be any set of L sentences and k any integer. Suppose that (1) all the sentences of degree ≤ k in ∆ have some property P, and (2) where X is any sentence of degree > k in ∆, if all sentences in ∆ of lower degree have P so does X. Then all the sentences in ∆ have P. Proof of Lemma 4 Suppose that (1) and (2) in the statement of the lemma are satisfied. By (1) all sentences of degree ≤ k have P. Now let k' be the smallest number greater than k such that there are degree k' sentences in ∆. By (2) all these sentences have P. Proceeding in this way, through the ∆-sentences of next highest degree, and then the next highest after that, and so on, we shall eventually infer that a sentence of any given degree will have P. Since every sentence in ∆ has some degree, it follows that all the sentences in ∆ must have P. Q.E.D. This Induction Principle is superficially unlike the one we encountered in the previous chapter, in that instead of the induction step linking each element with a suitably defined immediate predecessor (or predecessors, if there is more than one), here the induction step (2) links each element with all those ‘predecessors’ determined according to the criterion of having lower degree. Lemma 5 The only sentences of degree 0 or 1 in L are the literals of L. The proof is very simple and is left to the reader. Theorem 3 (Completeness Theorem for Propositional Trees) Let Σ be a finite set of sentences of L. 
Every finished open branch in a tree generated from Σ determines a truth-value distribution over the sentence letters in Σ which satisfies all the sentences in Σ.
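Before the proof, the notion of degree on which it depends can be made concrete. The sketch below assumes a nested-tuple encoding of sentences (a sentence letter is a string, and compounds are ('not', X), ('and', X, Y), ('or', X, Y) or ('imp', X, Y)); the function names `degree` and `components` are illustrative. It recomputes the example degrees given above and checks, for those examples, the claim of Lemma 3 together with the corresponding fact about double negation.

```python
WEIGHT = {'not': 1, 'and': 2, 'or': 2, 'imp': 2}

def degree(s):
    """Sum of the weights of all connective occurrences in s."""
    if isinstance(s, str):          # a sentence letter has degree 0
        return 0
    return WEIGHT[s[0]] + sum(degree(part) for part in s[1:])

def neg(x):
    return ('not', x)

def components(s):
    """The sentences a tree rule adds for an alpha, beta or doubly negated s.

    Only meaningful for non-literals, i.e. sentences of degree greater than 1.
    """
    if s[0] in ('and', 'or'):
        return [s[1], s[2]]
    if s[0] == 'imp':
        return [neg(s[1]), s[2]]
    t = s[1]                        # s is ('not', t) with t compound
    if t[0] == 'not':
        return [t[1]]
    if t[0] == 'imp':
        return [t[1], neg(t[2])]
    return [neg(t[1]), neg(t[2])]   # negated conjunction or disjunction

examples = {
    'A': 'A',
    '¬¬¬A': neg(neg(neg('A'))),
    'B∨¬B': ('or', 'B', neg('B')),
    'C→(¬C→(A∨¬¬B))': ('imp', 'C', ('imp', neg('C'), ('or', 'A', neg(neg('B'))))),
    '¬(¬A∨B)': neg(('or', neg('A'), 'B')),
}

for text, s in examples.items():
    line = f'{text}: degree {degree(s)}'
    if degree(s) > 1:               # literals have no components
        line += f', component degrees {[degree(c) for c in components(s)]}'
    print(line)
```

Giving the binary connectives weight 2 rather than 1 is what makes the check succeed for sentences such as A→B, whose component ¬A would otherwise have the same degree as the sentence itself.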

Proof Suppose that a finished open tree has been generated from Σ, and let B be an open branch on the tree. We shall show by strong induction on degrees that all the sentences on B take the value T under some distribution τ of truth-values over the sentence letters of Σ. We do this by first of all supposing τ to be any distribution of truth-values to the sentence letters in Σ such that every literal on B takes the value T. There does indeed exist such a distribution, because B is open, by assumption, and so there is no pair C, ¬C of literals on B. Second, we identify the set ∆ in Lemma 4 with the set of all sentences on B, and finally we define the property P of sentences in that lemma as follows: a sentence X in ∆ has P just in case X is assigned the value T by τ. We shall now prove using Lemma 4 that all the sentences on B take the value T under τ. To establish the step (1) required by the lemma, note that there is at least one literal on B (since B is open and finished). Hence we infer that the lowest degree of sentences in ∆ is 0 or 1. But we know that all the sentences on B of degree 0 or 1 are literals (by Lemma 5), and are T under τ, by assumption, and so step (1) is established. Now for the induction step (2). Consider any sentence X on B of degree greater than 1. We shall suppose (inductive hypothesis) that every sentence on B of degree lower than that of X is true under τ, and show that from that assumption it follows that X is assigned T by τ. The reader should verify that X is either (i) an α sentence or (ii) a β sentence, or else is (iii) a multiply negated sentence (i.e. one commencing with at least two consecutive occurrences of ¬). We shall establish the induction step for each of the cases (i)–(iii). (i) If X is an α sentence, then by Lemma 2, α1 and α2 are also on B, and by Lemma 3 both are of degree less than X. So by the inductive hypothesis they are both T under τ. Hence by Theorem 1 so is X.
(ii) If X is a β sentence then by Lemma 2 one of β1, β2 is on B, and by Lemma 3 both these are of degree less than X. By the inductive hypothesis, therefore, whichever of β1, β2 is on B is T under τ, and again by Theorem 1 so is X. (iii) If X is multiply negated it has the form ¬¬Y, and at some point, since B is finished, the Rule of Double Negation was applied to X. Hence Y is on B. But Y has smaller degree than X, and so by the inductive hypothesis Y is T under τ. Hence ¬¬Y is T, i.e. so is X. In each of the three possible cases, therefore, X is T under τ. The induction step (2) is now complete, and so by Lemma 4 every sentence on B is T under τ. In particular, all the members of Σ are T under τ, since they are all on B. Q.E.D. Why are Soundness and Completeness so called? It has already been explained that the Completeness Theorem is so called because it implies that every truth-functional consequence of a set of premises can be shown to be a consequence by a tree proof. The Soundness Theorem is so called because it implies that if there is a tree proof of a sentence X from a set of premises Σ, then X is a truth-functional consequence of Σ; the tree proof, in other words, cannot prove something to be a consequence without its really being one (a theory of formal proof with this property is traditionally called sound).

Part II First-order logic

Chapter 5 Introduction 1 SOME NON-TRUTH-FUNCTIONAL INFERENCES Consider the following inference (it is a type known as an Aristotelian syllogism): (S) All Cretans are liars. All liars are wicked. ∴ All Cretans are wicked. Historical note ‘All Cretans are liars’ was a famous, some would say infamous, remark uttered, according to St Paul’s Epistle to Titus, by Epimenides the Cretan; its self-refuting nature inspired a debate about the nature of truth which continues to this day and whose recent progress we review in Chapter 11. Everyone from Aristotle onwards has taken (S) to be a paradigmatic example of a deductively valid inference. However, it is not truth-functionally valid: none of the three sentences making up the inference is a truth-functional compound of anything but itself, so within a propositional language each would have to be represented by distinct sentence letters, say A, B and C respectively. (S), in other words, has the truth-functional form A B ∴C which is truth-functionally invalid, since we have only to make the assignment T to A, T to B and F to C to get a trivial truth-functional counterexample. Of course, these can’t be real truth-values if (S) is valid, since then it would be impossible for the two premises to be true and the conclusion false. A–T, B–T and C–F is, however, a consistent distribution of Ts and Fs over the sentence letters A, B and C, indicating that if (S) is valid, then it is not valid as a function of its truth-functional structure alone. And (S) certainly is valid. A simple pictorial method of demonstrating its validity, and also that of the other valid Aristotelian syllogisms, was discovered by the German mathematician Euler in the eighteenth century and refined by the English mathematician John Venn in the nineteenth. Euler’s method is very well known (Venn’s refinement rather less so; for a brief account of it see Kneale and Kneale 1962:421). 
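The contrast between (S) and its truth-functional form can be illustrated by a small search in Python. This is a rough sketch under stated assumptions: classes are modelled as subsets of an arbitrary three-element universe, ‘All Xs are Ys’ is read as class inclusion, and the names `subsets`, `all_are`, `against_S` and `against_form` are invented for the illustration. A search over one small universe is of course no proof of validity; it only shows that no counterexample turns up there, whereas the truth-functional form has one at once.

```python
from itertools import combinations, product

def subsets(domain):
    """All subsets of the domain, as Python sets."""
    return [set(c) for r in range(len(domain) + 1) for c in combinations(domain, r)]

def all_are(xs, ys):
    """'All Xs are Ys': the class xs lies inside (or coincides with) ys."""
    return xs <= ys

domain = ['d1', 'd2', 'd3']        # an arbitrary three-element universe (assumption)
classes = subsets(domain)

# Search for classes making both premises of (S) true and its conclusion false.
against_S = [(cretans, liars, wicked)
             for cretans, liars, wicked in product(classes, repeat=3)
             if all_are(cretans, liars) and all_are(liars, wicked)
             and not all_are(cretans, wicked)]

# The truth-functional form A, B ∴ C, searched over all rows of its truth table.
against_form = [(a, b, c)
                for a, b, c in product([True, False], repeat=3)
                if a and b and not c]

print(len(against_S), against_form)
```

The only row found for the truth-functional form is the assignment of T to A, T to B and F to C described above.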
First replace the specific class terms ‘Cretan’, ‘liar’ and ‘wicked (person)’ by non-specific, schematic ones, say P, Q and T. P, Q and T are now represented by circles drawn inside a rectangular box D representing a universe of discourse. To say that some Ps in D are Qs means that there are things in D in the intersection of P and Q, signified by placing an asterisk in that intersection (see Figure 1 (a)); to say that no Ps are Qs means that the

circles do not intersect (see Figure 1 (b)) and to say that all Ps are Qs means that the circle P is either wholly contained within, or coincides with, the circle Q (see Figure 1 (c); the fact that there is no asterisk in that part of the interior of Q which is not included in that of P leaves it open whether P coincides with Q or whether there are Qs which are not Ps). In an Euler diagram which makes the premises of the syllogism (S) true (whatever classes of thing P, Q and T denote), the circle P lies inside the circle Q which lies within the circle T. Hence the circle P must lie inside the circle T; i.e. all Ps are Ts and hence the conclusion of (S) must be true if the premises are true. As an exercise, construct an Euler diagram which will similarly demonstrate the validity of the syllogism All Ps are Qs. Some Ps are Ts. ∴ Some Qs are Ts. While the method of Euler diagrams is fine for evaluating the relatively restricted class of syllogisms, it is quite inadequate for dealing with inferences in which the information is not about simple class inclusions, intersections, complements, etc. Consider, for example, the following inference: (*) Some positive integer is less than or equal to every positive integer. Therefore, for every positive integer, there is one less than or equal to it. (*) is deductively valid, but it cannot be shown to be by an Euler diagram (try it!). We need a more powerful method and in this and the following chapters we shall develop one, called first-order logic. As a first step we need to introduce some new notation, which will take us beyond the propositional languages of Chapter 3 to a class of formal languages, called first-order languages, capable of exhibiting much more of the logical structure of sentences than is possible within propositional languages. These more elaborate languages will still

Figure 1

include the connectives →, ∧, ∨ and ¬, but to these will be added two other logical operators called the universal and existential quantifiers and a stock of extralogical symbols called predicate and relation symbols, variables and constants. We shall proceed as we did with the propositional languages, by first informally describing the syntax and

semantics of these extended languages, then giving a more precise formal characterisation and finally proving Completeness and Soundness Theorems for an augmented set of tree rules. 2 QUANTIFIERS AND VARIABLES The universal quantifier The premises and conclusion of the syllogism (S) are called universal generalisations. The statements in this extensive and important class are assertions to the effect that everything in a domain D of discourse satisfies some condition or other. ‘Every’ and its variant ‘all’ are collectively known as the universal quantifier, and so important is it in modern logic that it has its own special symbol, ∀. But ∀ never occurs alone like that when it is used to make an assertion; it is always immediately followed by what is called an individual variable, or simply variable, represented by a lower-case letter drawn from the end of the Roman alphabet, usually x, y or z. So the universal quantifier will always appear in the formalised version of an ordinary-language sentence in the composite form ∀x (or ∀y or ∀z). A definite assertion is made by combining ∀x with a condition on x, which we can represent formally by P(x), thus: ∀xP(x). This is to be read ‘Every individual x in the domain D satisfies the condition P(x)’, or, more simply, ‘For every x in D, P(x) is true’ (note that there is no explicit reference to D in ∀xP(x), just as there is often no explicit reference to the domain of discourse in ordinary speech). ∀xP(x) is called a formula of that language (more precisely, closed formula, but ‘formula’ will do for now) and the occurrence of the variable x following the quantifier ∀x in ∀xP(x) is said to be bound by that quantifier.
It is easy to see that ∀xP(x) is true or false just when ‘Everything in D satisfies P’ is true or false and we shall regard the formula ∀xP(x) as representing the logical form of ‘Everything in D satisfies P.’ However, no variable appears explicitly in the English sentence ‘Everything in D satisfies P’; in that case, why use a variable in its formalisation? Why not simply write ∀P, for example? The answer is that there are more complex statements than ‘Everything in D satisfies P’, for which it is at the very least useful, and may be indispensable, to employ variables and possibly more than one, to display clearly their logical structure. For example, try to paraphrase without using variables the statement that for any numbers x, y and z, x.(y+z)=x.y+x.z. You can do so, but not nearly so intelligibly and simply as if you use variables like x, y and z explicitly (which is, of course, why they were introduced into mathematics in the first place; this occurred in the seventeenth century and was one of the preconditions for the explosion of activity in the new mathematical sciences in that century). The great insight of the logical pioneers of the late nineteenth century was that what works so well in mathematics can work equally well in the representation of logical structure itself. ∀xP(x), ∀yP(y), ∀zP(z), etc. all assert that every individual in D satisfies P. Since this does not depend on D or P we can say that in all interpretations they are all true or all

false. We shall express this by saying that they are all logically equivalent sentences and we shall use the same symbol, ⇔, that we used for truth-functional equivalence to express this fact. The justification for using the same symbol is that two truth-functionally equivalent sentences are clearly logically equivalent, so that truth-functional equivalence is just a subspecies of this more extensive notion. The existential quantifier There are not just one but two types of quantifier in first-order logic, the second being the existential quantifier, symbolised ∃. Like the universal quantifier, it cannot exist alone in sentences but must always be accompanied by a variable to form a composite symbol ∃x. ∃xP(x) is read as saying ‘there is at least one individual x in the domain D for which P(x) is true’. Similar considerations apply here as to the universal quantifier. ∃xP(x) is just another way of saying that something in D satisfies the condition P and so ∃xP(x), ∃yP(y), etc. are logically equivalent formalisations of that same assertion.

Quantifier interdependence There is a very important relationship between the universal and existential quantifiers: either can be expressed in terms of the other and negation. For ∀xP(x) is true if and only if every individual in D has the property P, i.e. if and only if there is no individual in D which does not have P; but this is exactly what ¬∃x¬P(x) says. Since these biconditionals also hold for any D and any property P, we can infer that ∀xP(x) is logically equivalent to ¬∃x¬P(x), i.e. ∀xP(x) ⇔ ¬∃x¬P(x). A similar argument to that above, which will be left to the reader as exercise 2 below, shows that ∃xP(x) ⇔ ¬∀x¬P(x). Exercises 1 Suppose the domain is that of human beings, that P(x) says that x is tall and that Q(x) says that x is broad. State in words and without mentioning the variables x and y, what each of the following says: (i) (ii) (iii) (iv) Do   and   say anything different from (i) and (ii) respectively? 2 Explain carefully why ∃xP(x) ⇔ ¬∀x¬P(x).
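The interdependence of the two quantifiers can be tried out mechanically on a finite domain, where ‘every’ and ‘some’ correspond to Python's built-in `all` and `any`. The sketch below is an illustration only: the helper names `forall`, `exists` and `interdependence_holds` are invented, the three-element domain is arbitrary, and a property P is represented simply by a table saying which individuals have it.

```python
from itertools import product

def forall(domain, cond):
    """'Every individual in the domain satisfies cond.'"""
    return all(cond(x) for x in domain)

def exists(domain, cond):
    """'At least one individual in the domain satisfies cond.'"""
    return any(cond(x) for x in domain)

def interdependence_holds(domain):
    """Check both equivalences for every property P definable on the domain."""
    for values in product([True, False], repeat=len(domain)):
        table = dict(zip(domain, values))
        P = table.get
        not_P = lambda x: not table[x]
        # 'every x is P' against 'there is no x which is not P'
        if forall(domain, P) != (not exists(domain, not_P)):
            return False
        # 'some x is P' against 'not every x fails to be P'
        if exists(domain, P) != (not forall(domain, not_P)):
            return False
    return True

print(interdependence_holds(['a', 'b', 'c']))
```

The check runs through all eight properties on the domain; the general argument in the text is what shows that the same holds for any domain and any property.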

3 RELATIONS A relation is a state of affairs that may or may not hold between individuals. ‘x is less than y’ is a binary, or two-place, relation between numbers; ‘x is the godmother of y’ and ‘x is a sister of y’ are binary relations between people. ‘x is between y and z’ is a three-place relation between individuals which may be numbers, or people on a seat, or times, or places, while ‘x = (y+z)/w’ is a four-place relation between numbers. Relations of more than four places might seem very arcane objects, not the sorts of things that would crop up much in practical discourse. In fact, they’re commoner than might be thought, especially in the mathematical sciences, where so-called functional relations are described which can hold between enormous numbers of individuals (for example molecules of a gas). In first-order logic, expressions of the form R(x1, x2,…, xn) symbolise n-place relations (since n is mentioned explicitly it is convenient to employ numerically subscripted variables here instead of x, y, z,…). In that expression R is called an n-place relation symbol. One-place relations are not what is normally understood by relations at all, but properties or predicates of individuals. These will be symbolised by expressions of the form P(x), Q(x), etc. P, Q, etc. are called predicate symbols. ‘Is green’, ‘is a prime number’, ‘is a nuclear reactor’, etc. are ordinary-language predicate terms. It is important to grasp that R(x, y, z,…) signifies that x, y, z in that order stand in the relation R. It may well be that if x and y in that order stand in a binary relation R then so do y and x. But it may be the case that for some pair of individuals x and y, if x and y in that order stand in R, then y and x definitely do not. For example, if R(x, y) represents the binary relation ‘x is less than y’ in the set N of natural numbers and R(x, y) is true for any pair of values of x and y in N, then R(y, x) is false for those values.
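One way to make the role of order vivid is to model an n-place relation as a set of ordered n-tuples. The Python sketch below does this for ‘less than’ and ‘between’ on a small initial segment of the natural numbers; the segment 0 to 5, and the names `less_than`, `between` and `reversed_pairs`, are assumptions made for the illustration.

```python
N = range(6)   # a small initial segment standing in for the natural numbers

# 'x is less than y' as a set of ordered pairs (x, y)
less_than = {(x, y) for x in N for y in N if x < y}

# 'x is between y and z' as a set of ordered triples (x, y, z)
between = {(x, y, z) for x in N for y in N for z in N if y < x < z}

# Order matters: the pair (2, 5) stands in the relation, the pair (5, 2) does not.
print((2, 5) in less_than, (5, 2) in less_than)

# No pair stands in 'less than' in both orders.
reversed_pairs = {(y, x) for (x, y) in less_than}
print(less_than & reversed_pairs)

# The same point for a three-place relation.
print((3, 1, 4) in between, (1, 3, 4) in between)
```

Because tuples are ordered, (x, y) and (y, x) are different members of the set, which is just the convention the notation R(x, y) is meant to carry.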
It would be impossible to convey this information if the symbolism R(x, y) did not implicitly impose an order on x and y in the way they satisfy R. This notational convention does not, however, prevent us from saying that x and y may stand in some binary relation R (for example the identity relation =) independently of the order in which they are written, for we can express this fact by means of the formula R(x, y)→R(y, x). Let us pause here and look again at the inference (*) (above, p. 64). The premise and conclusion are both true statements about numbers, but at first sight they seem to be logically unrelated true statements. In fact, they are not logically unrelated at all, for (*) is a deductively valid inference: it is impossible for the premise of (*) to be true and the conclusion false. We shall prove this later, but to prepare the way it will be useful to discuss just what it means for it to be impossible for its premise to be true and the conclusion false. The clue lies in the observation that any demonstration of (*)’s validity should not depend on further unspecified information about the nature of the binary relation ‘less than or equal to’. Were it to do so then the truth of the premise, independently of that additional information, would not be sufficient to ensure the truth of the conclusion, which it does. Hence, (*) must remain valid, in the sense of the provisional definition in Chapter 1, if we replace ‘x is less than or equal to y’ (x≤y) by the symbolic representation R(x, y) of a generic binary relation. Nor should (*)’s deductive validity depend on any further unspecified information

about the nature of natural numbers themselves, which implies that it is valid independently of the domain of the quantifiers. Putting these observations together, we can conclude that showing that (*) is deductively valid means showing that whatever set D is selected as the domain of the quantifiers in the formalisation below and whatever binary relation defined in D is selected as the interpretation of R in D, the premise of
∃x∀yR(x, y) ∴ ∀y∃xR(x, y)

is never true and the conclusion false. These observations go far to redeem the promise, made in Chapter 1, that eventually we would define in a non-circular way the all-important ‘cannot’ in the provisional definition given there of deductively valid inference (‘a valid deductive inference is one whose premises cannot all be true and conclusion false’). That provisional definition can now be updated as follows: an inference is deductively valid if there is no structure consisting of a domain and relations defined in that domain which interpret the relation symbols in the inference, such that in that structure the premises are true and the conclusion is false. Any structure consisting of a domain and relations defined in that domain which interprets a formalised inference we shall, naturally enough, call an interpretation of the inference (we shall elaborate this definition later, but it is good enough for now). An interpretation which makes the premises true and the conclusion false we shall, by analogy with the truth-functional case, call a counterexample to it. When we have added tree rules for the quantifiers to the truth-functional ones of Chapter 4 we shall be in a position to prove by means of a closed tree that there is no counterexample to (*) and the various other inferences cited in this chapter. In the meantime, we need to complete the formal apparatus introduced in this chapter by adding one more item to the formal vocabulary of first-order languages. Suppose we try to formalise the following inference: (**) If Mary is happy then everyone is happy. ∴ If Mary is happy then so is Manfred. In (**) two specific individuals are referred to, Mary and Manfred. We have already borrowed variables from mathematics and we shall now borrow again from it, this time constants, lower-case letters a, b, c,… from the beginning of the Roman alphabet, whose function is to refer to specific individuals in the domain. 
Using such constants a and b to stand for Mary and Manfred respectively and the predicate symbol M to replace the predicate ‘is happy’, the inference above can be formalised:
M(a)→∀xM(x) ∴ M(a)→M(b)

The introduction of constants completes the formal vocabulary into which we shall translate, or formalise, ordinary-language sentences. In the remainder of this chapter we shall develop some general rules and strategies for doing this.
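The updated definition of validity invites a simple experiment: enumerate interpretations over small domains and look for a counterexample. The Python sketch below does this for (*), read as ∃x∀yR(x, y) ∴ ∀y∃xR(x, y), and for (**), read as M(a)→∀xM(x) ∴ M(a)→M(b); both readings are assumptions about the formalisations intended here, and all function names are invented for the illustration. Finding no counterexample on domains of up to three individuals proves nothing about larger domains; the general result needs the tree rules for the quantifiers.

```python
from itertools import product

def relations(domain):
    """Every binary relation on the domain, as a set of ordered pairs."""
    pairs = [(x, y) for x in domain for y in domain]
    for bits in product([False, True], repeat=len(pairs)):
        yield {p for p, keep in zip(pairs, bits) if keep}

def premise(D, R):       # some x bears R to every y
    return any(all((x, y) in R for y in D) for x in D)

def conclusion(D, R):    # every y has some x bearing R to it
    return all(any((x, y) in R for x in D) for y in D)

def counterexamples(max_size, true_part, false_part):
    """Interpretations (D, R) making true_part true and false_part false."""
    found = []
    for n in range(1, max_size + 1):
        D = list(range(n))
        found += [(D, R) for R in relations(D)
                  if true_part(D, R) and not false_part(D, R)]
    return found

def mary_counterexamples(max_size):
    """Interpretations of M, a, b making M(a)→∀xM(x) true and M(a)→M(b) false."""
    found = []
    for n in range(1, max_size + 1):
        D = list(range(n))
        for bits in product([False, True], repeat=n):
            M = {x for x, keep in zip(D, bits) if keep}
            for a, b in product(D, repeat=2):
                prem = (a not in M) or all(x in M for x in D)
                concl = (a not in M) or (b in M)
                if prem and not concl:
                    found.append((D, M, a, b))
    return found

print(len(counterexamples(3, premise, conclusion)))     # search for (*)
print(len(mary_counterexamples(3)))                     # search for (**)
print(counterexamples(2, conclusion, premise)[0])       # the converse of (*)
```

The last line swaps premise and conclusion of (*): the converse inference does have counterexamples, already on a two-element domain, so the order of the quantifiers matters.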

Exercises 1 Suppose that the domain is the set of positive integers and that R(x, y) is now the relation ‘x is less than or equal to y’. Explain without mentioning the variables x and y what the following sentences say and whether they are true or not. (i) (ii) (iii) (iv) (v) (vi) 2 Which of (i)–(vi) remain true when R(x, y) is interpreted as ‘x is less than y’ on the same domain? 3 Explain why (i) If P(a) is true in a domain D, then ∃xP(x) is true in D. (ii) If P(a) is true in a domain D, then   is false in D.

4 FORMALISING ENGLISH SENTENCES How do we know when we have the right, or a right, first-order formalisation of a natural-language sentence? Practice helps, but the following rule is a good one to try: compare the conditions in which the formalised and unformalised sentences are each true, by using informal arguments to see what seems to follow from each and what seems to imply each. This may sound a bit vague and also question-begging given that formalising ordinary discourse is just what is supposed to aid us in seeing what does and does not follow from what. But we should not despair. We already have some logical knowledge and we can use that and the machinery we subsequently develop for cross-checking our guesses. Another good rule is to start with simple examples. The syllogism (S) at the beginning of this chapter is one such. To formalise (S), we have to formalise sentences of the form ‘All Ps are Qs’, where the domain D is not explicit. This at any rate seems straightforward, for another way of stating what is conveyed by ‘All Ps are Qs’ is by means of the universally quantified conditional ‘For any x in D, if x is a P then x is a Q’. Granted this, we can formalise ‘All Ps are Qs’ as ∀x(P(x)→Q(x)) and similarly the other sentences in the inference. Hence, letting P represent ‘Cretan’, Q represent ‘liar’ and another predicate symbol T represent ‘wicked’, we obtain the formalised version of (S): ∀x(P(x)→Q(x)), ∀x(Q(x)→T(x)) ∴ ∀x(P(x)→T(x))


So far so good. But what about the syllogism in section 1?

All Ps are Qs.
Some Ps are Ts.
∴ Some Qs are Ts.

We know how to deal with the ‘All Ps are Qs’ of the first premise, but what about the ‘Some Ps are Ts’ of the second? Most people’s first thought is to formalise this analogously with ‘All Ps are Qs’, i.e. as ∃x(P(x)→T(x)), being (mis)led by the apparent grammatical similarity of the two types of sentence, where the only difference seems to be in the initial quantifiers ‘Some’ and ‘All’. But ∃x(P(x)→T(x)) is definitely wrong and it is easy to show why. Consider the false sentence ‘There is an even positive integer not divisible by two.’ In the domain of the positive integers, let P be the property of being even and T that of not being divisible by two. Thus we have a statement of the form ‘Some Ps are Ts.’ But ∃x(P(x)→T(x)) is true in the domain of the positive integers and so cannot represent the logical form of ‘There is an even positive integer not divisible by two.’ (It is easy to show that ∃x(P(x)→T(x)) is true. First, we know that 3 is a positive integer which is not even. Let the constant a denote 3. So we know that ¬P(a) is true. Hence ¬P(a)∨T(a) is true, because for any sentences denoted by sentence letters A and B, ¬A∨B is a truth-functional consequence of ¬A. But ¬A∨B is truth-functionally equivalent to A→B. Hence we know that P(a)→T(a) is true. Define the predicate G(x) to be P(x)→T(x). Thus G(a) is true and hence so is ∃xG(x) (compare exercise 3(i) above); i.e. ∃x(P(x)→T(x)) is true.)

An interesting lesson of this demonstration is that grammatical form is not always a good guide to logical form, for we see that there is more than a quantifier difference between the logical structure of ‘All Ps are Ts’ and ‘Some Ps are Ts.’ So what formula does exhibit the logical structure of ‘Some Ps are Ts’? This is not difficult to answer. ‘Some Ps are Ts’ says that there is at least one P which is also a T, i.e. there is at least one individual x in the domain such that x is a P and x is a T. We can straightforwardly transcribe this statement into our logical notation, whence we obtain the formula ∃x(P(x)∧T(x)). The syllogism is therefore rendered:

∀x(P(x)→Q(x))
∃x(P(x)∧T(x))
∴ ∃x(Q(x)∧T(x))
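The difference between the conditional and the conjunctive reading of ‘Some Ps are Ts’ can be checked mechanically. The following is only a sketch under a stated assumption: the infinite domain of positive integers is cut down to the integers 1 to 20, and the helper names (`implies`, `wrong`, `right`, `witnesses`) are invented for the illustration.

```python
# Illustrative sketch: a finite stand-in (1..20) for the positive integers.
domain = range(1, 21)

def P(x):
    """x is even."""
    return x % 2 == 0

def T(x):
    """x is not divisible by two."""
    return x % 2 != 0

def implies(a, b):
    """Truth-functional conditional: A→B is equivalent to ¬A∨B."""
    return (not a) or b

# Conditional reading of 'Some Ps are Ts': ∃x(P(x)→T(x)).
wrong = any(implies(P(x), T(x)) for x in domain)

# Conjunctive reading: ∃x(P(x)∧T(x)).
right = any(P(x) and T(x) for x in domain)

# Every odd number makes the conditional true, simply because it is not a P.
witnesses = [x for x in domain if implies(P(x), T(x))]

print(wrong, right, witnesses[:3])
```

On this stand-in domain the conditional reading comes out true, with 3 among its witnesses as in the argument above, while the conjunctive reading is false, matching the falsity of the English sentence.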

But now suppose we are asked to formalise the two sentences ‘All Ps are Qs’ and ‘Some Ps are Qs’ as isolated sentences, (i) subject to the constraint that the domain of the variables is in each case to be the set of Ps (we assume that it is not empty), and (ii) with the domain unspecified. (i) In the domain of Ps, ‘All Ps are Qs’ says that everything is a Q, while ‘Some Ps are Qs’ says that something is a Q. Thus ‘All Ps are Qs’ becomes simply ∀xQ(x) and ‘Some Ps are Qs’ becomes ∃xQ(x). (ii) The answer is underdetermined. ‘All Ps are Qs’ could be ∀x(P(x)→Q(x)) or it could be ∀xQ(x) if you want to make the domain the set of Ps, and there is no reason either implicit or explicit in the question why you should not. Similarly, ‘Some Ps are Qs’ is legitimately either ∃x(P(x)∧Q(x)) or ∃xQ(x). If, however, ‘All Ps are Qs’ occurs not as an isolated sentence but in the context of an inference, then the following rule must be observed: the quantified variables must all refer to the same domain throughout, just as we should take the unformalised sentences as referring to the same domain throughout. Thus in the syllogism (S) it would be definitely wrong to render ‘All Cretans are liars’ as ∀xQ(x), taking the domain to be Cretans and Q(x) the predicate ‘is a liar’, since the next premise states something about the members of a different class, that of the liars themselves. In formalising this syllogism, therefore, the predicates ‘being a Cretan’, ‘being a liar’ and ‘being wicked’ must all be regarded as predicates defined in a common domain. Now let us try something with a more complex structure. Formalise ‘Some people like everyone who likes them’ subject to the constraints that (a) the domain is one of people only, and (b) the only relation or predicate symbol you are allowed to use is a single binary relation symbol L, where L(x, y) is to be read ‘x likes y’.
The following paraphrase is a useful first step: ‘There is at least one person x such that for every person y, if y likes x then x likes y.’ Since we are now considering a domain consisting of people, explicit mention of the fact that x and y are people is unnecessary and we get ‘There is at least one x such that for all y, if y likes x then x likes y.’ Now we can translate term by term, obtaining:

∃x∀y(L(y, x)→L(x, y))
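The paraphrase ‘there is at least one x such that for all y, if y likes x then x likes y’ can be evaluated directly over a small domain. The sketch below assumes a made-up three-person domain and an invented ‘likes’ relation; none of the names or facts come from the text.

```python
# Hypothetical domain of people and an invented 'likes' relation.
people = ["ann", "bob", "cy"]
likes = {("ann", "bob"), ("bob", "ann"), ("cy", "ann"),
         ("ann", "cy"), ("cy", "bob")}

def L(x, y):
    """L(x, y): x likes y."""
    return (x, y) in likes

def likes_all_who_like_them(x):
    """For a fixed x: for all y, if y likes x then x likes y."""
    return all((not L(y, x)) or L(x, y) for y in people)

# 'Some people like everyone who likes them': there is such an x.
sentence = any(likes_all_who_like_them(x) for x in people)
witnesses = [x for x in people if likes_all_who_like_them(x)]

print(sentence, witnesses)
```

Here bob fails the inner condition because cy likes bob while bob does not like cy; ann and cy both satisfy it, so the existential sentence is true in this interpretation.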

The logical structure of a sentence determines what follows deductively from it. Sometimes, however, that structure may not be made obvious by its vernacular expression, as we noted earlier. A particularly instructive example is found in what grammarians call adverbial constructions. For example, consider the following English sentences: ‘Minerva is thinking deeply’, ‘Matilda is waltzing slowly’ and ‘It is raining heavily.’ Clearly, they respectively imply that Minerva is thinking, that Matilda is waltzing and that it is raining. How are we to formalise the sentences to bring out these logical properties? One’s first answer is likely to be that ‘Matilda is waltzing slowly’ has the form P(a), where a is a constant representing Matilda and P is a predicate symbol representing the property of waltzing slowly. The trouble with this answer is that it is powerless to reveal why ‘Matilda is waltzing’ is a deductive consequence of ‘Matilda is waltzing slowly.’ For there is no way to extract from P(a) the information that waltzing is part of P. P itself has no ‘parts’; it is just a letter. Since ‘Matilda is waltzing’ obviously is a consequence of ‘Matilda is waltzing slowly’, we seem justified in inferring that P(a) does not faithfully represent the logical form of ‘Matilda is waltzing slowly.’ A more careful analysis is needed. Let us go back to grammar. Words ending in ‘-ly’, like ‘slowly’, ‘deeply’, ‘heavily’, etc., are adverbs; they qualify verbs, in this case the verbs ‘is waltzing’, ‘is thinking’ and ‘is raining’. Verbs describe actions or processes, and hence the logical way to parse adverbial sentences is as statements asserting the existence of actions and processes possessing the relevant properties. ‘Minerva is thinking deeply’ gets parsed as ‘There is a process which is a thinking process, which is deep and which is currently being undergone by Minerva’; formally, ∃x(T(x)∧D(x)∧Q(x)), where the domain includes processes (however we want to think of these), T(x) says that x is a thinking process, D(x) that x has depth in some relevant sense and Q(x) that x is a process currently being undergone by Minerva. It is fairly obvious that ∃x(T(x)∧Q(x)) (‘Minerva is thinking’) is a logical consequence of ∃x(T(x)∧D(x)∧Q(x))
—we shall soon be able to prove this formally—and so our original problem is solved. The other adverbial sentences above can be dealt with similarly. But some people are wary of a logical analysis that seems to commit them to what they see as a metaphysical position, in this case the claim that actions and processes enjoy real existence. But all the formalisation has done is to make explicit what is implicit in our ordinary speech. For in ordinary speech actions and processes are certainly things to which we assign properties and place in relation to other things. This sort of commitment pervades general usage (‘Actions speak louder than words’, ‘Gluttony is a deadly sin’, etc.), whether we like it or not. But if we don’t, we shouldn’t blame the logical analysis; it merely brings out what is already there. There is another way of analysing the logical structure of adverbial sentences where there exists some scale of measurement of the quantities mentioned. Consider, for example, the sentence ‘The train is moving quickly.’ Physicists would most probably understand a sentence like this as describing the speed, or velocity, as they would term it (velocity is speed in a given direction), at which the train is moving. For them the sentence will therefore say something like ‘there is a velocity r such that v(train) is in that (vague) range of values corresponding to our (vague) concept of going quickly (quickly for trains, that is, not for supersonic aircraft)', where v is the velocity function. We can represent ‘v(train) = k’, where k is a number, as a binary relation V(train, k), where V(a, b) holds between any pair (a, b) of individuals just in case a is a material thing and b is a number measuring the velocity of a. 
So now we can formalise ‘the train is moving quickly’ as ∃y(V(a, y)∧Q(y)), where the domain consists of numbers and material objects (and maybe more besides); where a denotes the train; and where Q(y) is true for any individual y in the domain just in case y is a number falling in the range ‘quick’ when measuring velocities. Clearly, ‘the train is moving’ is formalised in this style as ∃yV(a, y), which is, as we shall soon be able to show formally, a deductive consequence of ∃y(V(a, y)∧Q(y)). In its intended interpretation ∃y(V(a, y)∧Q(y)) refers to a domain containing material objects and the values, whether actual numbers or not, of some scale of measurement. Such ‘mixed’ domains are, if only implicitly, referred to widely in ordinary discourse. Consider, for example, Abraham Lincoln’s celebrated observation that you can fool all of the people some of the time and some of the people all of the time, but you can’t fool all of the people all of the time. Lincoln’s remark refers to both people and times and the domain of its quantifiers must consequently include both types of entity. Since we allow only a single domain for the quantifiers, these subdomains must be embraced within a ‘super-domain’ containing both types of entity, times and persons. These can then be regarded as subsets of the wider domain, distinguished formally by predicate symbols T and P respectively. We can now formalise Lincoln’s utterance as

where F(x, y) represents the binary relation ‘x can be fooled at y’ (we shall assume F(x, y) is simply false when x is not a person or y is not a time). The fact that we can introduce time into the formal discussion in this way means that we can capture within a first-order scheme a very important area of ordinary discourse that might seem otherwise out of our reach: tensed utterances. The following three statements are obviously very different in meaning. ‘Rachel went to the cinema’, ‘Rachel is now going to the cinema’, and ‘Rachel will go to the cinema’; the first is in the past, the second in the present and the third in the future tense. A subtheory of modern formal logic called temporal logic has sprung up in the last half-century or so, which adds primitive temporal operators to the usual battery of logical items, the connectives and quantifiers, in order to formalise sentences such as these. But quantifying over times, as domain objects, achieves just the same end and requires no extension of the logical vocabulary. In the process tensed statements become untensed; indeed, they become essentially timeless. Thus, the first of the three tensed statements about Rachel can be expressed as ‘There is a time t before now (t0) such that at t Rachel goes to the cinema’ and is then readily formalised as

∃t(R(t, t0)∧S(a, t))

where a is a constant denoting Rachel, S(a, t) says that the person a goes to the cinema at time t, t0 is another constant signifying the present time relative to some method of measuring time, like the usual date and clock one, and R(t, t0) says that t is before t0 according to this standard of measurement. Note that no additional predicates T(t), i.e. ‘t is a time’, or P(a), ‘a is a person’, need be introduced explicitly, since the status of t and a is built into the interpretation of the relation symbols R and S. The formalisation of the remaining two statements about Rachel is left as an exercise. Reasoning about time according to modern physics involves a larger set of relations and predicates. These predicates and relations are those of modern mathematics and the logical structure of mathematical reasoning deserves a separate treatment, which we shall consider later. But there is nothing in this sort of reasoning, apart from its complexity, that poses any difficulty of principle in representing it within the framework of a first-order language. However, there are other constructions in English that pose more of a challenge to first-order formalisation. We have already come across one type, the so-called counterfactual conditionals. Others are modal statements, i.e. assertions of possibility and impossibility, and finally statements involving probabilities. All these topics are extensive and have had whole books written on them. Some attempt will be made in the final chapters to discuss them without going to book length to do so.

Exercises

1 Explain why, if the domain of ‘All Ps are Qs’ is some set D, the sentence is true if there are no Ps in D.

2 Formalise the following sentences. Take the quantified variables to range over a domain of people, and use the constant a to represent Jane and binary relation symbols B, S, O and L in such a way that B(x, y) stands for the relation ‘x is a brother of y’, S(x, y) for ‘x is a sister of y’, O(x, y) for ‘x is older than y’ and L(x, y) for ‘x likes y’.
(i) Jane has a brother.
(ii) Jane has no sisters.
(iii) Some people like all people.
(iv) Some people are liked by nobody.
(v) Nobody is their own brother or sister.
(vi) Some people have no brothers.
(vii) Some people have no sisters older than them.
(viii) Some people have brothers older than them whom they like.
(ix) Some people like no one’s brother, but there are sisters of some people who are liked by everybody.
(x) Some people like no one who likes themselves.
(xi) Everyone likes everyone who likes someone.

3 Formalise the following using the relations, predicates and constants indicated:
(i) Minerva is thinking deeply (domain: processes; M(x): x is undergone by Minerva; D(x): x is deep).
(ii) Carla got home at 5p.m. yesterday (domain: times and people; S(x, y): x is a person and y is a time and x gets home at y; a: Carla; b: 5p.m. yesterday).
(iii) Frank has seen the film and won’t see it again (domain: times and people; R(x, y): x is a time and y is a time and x is before y; S(x, y): x is a time and y is a time and x is the same as y or after y; T(x, y): x is a person and y is a time and x sees the film at y; a: Frank; b: the present time).
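The treatment of tense in this section, where ‘Rachel went to the cinema’ is read as ‘there is a time t before now such that at t Rachel goes to the cinema’, can be illustrated on a toy model. This is a sketch under loud assumptions: times are modelled as the integers 0 to 9, the present t0 is taken to be 5, and the recorded cinema visits are invented.

```python
# Assumed model: times are integers, t0 is 'now'; the visits are invented data.
times = range(0, 10)
t0 = 5
cinema_visits = {("rachel", 2), ("rachel", 7), ("sam", 8)}

def R(t, u):
    """R(t, u): t is before u."""
    return t < u

def S(a, t):
    """S(a, t): person a goes to the cinema at time t (untensed)."""
    return (a, t) in cinema_visits

def went_to_cinema(a):
    """Past tense as quantification over times: some t before t0 with S(a, t)."""
    return any(R(t, t0) and S(a, t) for t in times)

print(went_to_cinema("rachel"), went_to_cinema("sam"))
```

The tensed English verb disappears: S itself is timeless, and all the temporal information is carried by the relation R and the constant t0.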

Chapter 6
First-order languages: syntax and two more tree rules

1 FIRST-ORDER LANGUAGES

In the previous chapter we showed how we could represent more of the logical structure of English sentences, more, that is, than truth-functional structure, in a formal notation containing, besides truth-functional connectives, also predicate symbols and relation symbols of arbitrary numbers of places, variables, constants and quantifiers. These form the basic vocabulary items of a class of formal languages called first-order languages, whose syntax and semantics we shall investigate in this and the following chapters. Syntactically, a first-order language is like a propositional language in that it is the set of all sentences which can be constructed from some class of ‘atomic’ components using a specified set of logical operations. However, there are two important differences: first, the set of connectives is fixed, the same for all first-order languages; and second, the atomic sentences of a first-order language are now not sentence letters, single and indivisible, but themselves constructed from a specified vocabulary of logical and extralogical items. The extralogical items are themselves sub-divided into a ‘descriptive’, or referential, part and a structural part. These categories of vocabulary item are specified as follows (the boldface capital letter L refers to an arbitrary first-order language): (i) L’s logical vocabulary contains the same connectives ∧, ∨, ¬ and → as the standard propositional language of Chapter 3 and in addition the two quantifiers ∀ and ∃. (ii) The referential part of L’s extralogical vocabulary consists of a set of predicate and n-place (n>1) relation symbols (how many of each may vary from language to language, though there may be infinitely many of both and there must be at least one predicate symbol if there are no relation symbols and vice versa) and a set (possibly empty) of constants.
The exact nature of L’s predicate and relation symbols need not concern us; all the discussion of them is carried out in the metalanguage (Chapter 3, section 2) and in this metalanguage we shall use the capitals P, Q and if necessary also P1, P2,…, Q1, Q2,… etc. to refer to distinct predicate symbols of L. Relation symbols of L will be referred to by capitals R and S and if necessary also R1, R2,…, S1, S2,…. The number of places of any relation symbol will be assumed known without needing explicit signalling by means of a dedicated notation. Constants of L will be represented by lower-case letters a, b, c from the beginning of the Roman alphabet and if we run out, a1, a2,…, b1, b2,…, c1, c2,…. (iii) The structural items in L’s extralogical vocabulary are two brackets (), the comma, and an indefinitely large supply of variables. We shall represent distinct variables, as before, by distinct lower-case letters x, y, z,… from the end of the Roman alphabet and by x1, x2,… if we run out of these. Define an expression of L to be any finite string of symbols from L’s vocabulary. Some of these will be ‘meaningful’, like for example, if L contains the predicate symbol P; others will not, like x) xx, xR. We shall now proceed in stages to identify these ‘meaningful’ strings and in particular those of which it can sensibly be said that they are true or false when interpreted in an appropriate domain. A notational convention: in this and the following chapters, italic capitals A, B, C… from the beginning of the Roman alphabet will be used to denote arbitrary expressions. In more precise terminology, A, B, C… are metalinguistic variables ranging over the set of expressions of L; however, like Horace who saw and approved the better and followed the worse, we shall generally continue in the sloppier way to talk about arbitrary expressions, sentences, languages, etc. The potential truth- and falsity-bearing expressions of L are what we are really interested in. By analogy with their informal counterparts these will be called the sentences of L. Rather than defining them directly, it is easier first to take a detour via a larger class of expressions called the formulas of L. Recall from the previous chapter that an English sentence of the form ‘All Ps are Qs’ can be formalised as a universally quantified conditional ∀x(P(x)→Q(x)). We can think of ∀x(P(x)→Q(x)) as built up from the basic vocabulary of a first-order language in the following increasingly large ‘pieces’: P(x), Q(x), (P(x)→Q(x)), ∀x(P(x)→Q(x)). These pieces will be called formulas of L and the pieces en route subformulas of the final formula. Like the corresponding class of sentences of a propositional language, the class of formulas of L can be uniquely specified by an inductive definition (cf. Chapter 3, section 3). First we define the class of expressions which are unconditionally formulas of L.
These are called the atomic formulas of L and they are all expressions of the form P(t) or R(t1,…, tn), where R is an n-place relation symbol, for those values of n>1 such that L has relation symbols of those numbers of places and where t, t1,…, tn are any constants or variables of L. An expression A is now said to be in the class F of formulas of L if and only if A is either (i) an atomic formula of L, or (ii) of the form ¬B, (B∧C), (B∨C), (B→C), ∀xB or ∃xB, where x is any variable and B and C are formulas of L. Brackets are placed around B∧C, B∨C and B→C in (ii) so that the subformula structure of each formula in L is determinate: the subformulas of a formula A can be defined explicitly as all the nodes on A’s ancestral tree (this is like the ancestral tree of a sentence in a propositional language, except that ∀xB and ∃xB each have a single vertical branch down to B). As before we shall omit outer brackets in ordinary discussion, writing ‘the formula A→B’ rather than ‘the formula (A→B)’. To aid the eye, we shall sometimes alternate curved and square brackets [] in complex formulas. Where ∀xB and ∃xB are formulas of L, the quantifiers ∀x and ∃x are said to have an initial occurrence in them (they may also have other occurrences in these formulas). The scope of those initially occurring quantifiers ∀x and ∃x is in each case said to be the occurrence of the subformula B immediately following each of them. A variable is said to occur in a formula if it appears in that formula at some point other than that immediately following a quantifier; for example, ∀xP(x) has only one occurrence of x. An occurrence of a variable x in a formula A is said to be free if it is not in the scope of any quantifier ∀x or ∃x in A. An occurrence of a variable is bound if it is not free, i.e. if it is in the scope of a quantifier on that variable. Thus there are four occurrences of a variable in the formula

three of which are free and one bound; the second occurrence of y is bound. From the way freedom and bondage for variables are defined, it is clear that every occurrence of a variable in a formula is either free or bound. A formula which has a free occurrence of some variable is said to be open. A formula which is not open is closed. The closed formulas are also called the sentences of L. The sentences of L, so defined, are so called because they will be the expressions of L which can be true or false, depending on the interpretation of the predicate and relation symbols in them. But in that case the definition of F seems definitely over-permissive, for it includes as closed formulas expressions like or even . These ‘sentences’ seem to make very little sense. They are included in F because to exclude them would make for a very complicated definition of formula and, as we shall see in the next chapter, they do in fact make perfectly good sense;

will turn out to say

the same as and the same as P(a). However, such formulas can easily be avoided in practice and we shall not be bothered by them. We end this section by introducing some notational conventions which will be useful in the subsequent discussion. We shall signify by A(x1,…, xn) an arbitrary open formula of L with free occurrences of the variables x1,…, xn. Thus A(x) signifies a formula free in just the one variable x. Where A is any formula and t a constant or variable, A(t/x) signifies the result of substituting t for every free occurrence of x in A; if A has no free occurrence of x we shall regard A(t/x) as just A itself. Where A is known from the context of the discussion to have free occurrences of x and only of x, i.e. where A is A(x), we shall usually write A(t) instead of A(t/x). Exercises In the following assume that the relevant first-order language contains all the constants and predicate and relation symbols mentioned. 1 Explain carefully how by reference to the clauses (i) and (ii) in the definition above of formulas of L, you can determine that the expression is a formula of L. List all its subformulas. 2 What is the scope of the quantifier x in each of the following? (a) (b) (c)


(d) (e) 3 All the sentences in question 1 are of the form

.

(i) What is A(a) in each case? (ii) What is B(a/x) in each case? 4 Specify the scope of each occurrence of a quantifier in the formula and also indicate all the free occurrences of each variable. 5 Indicate all free occurrences of a variable in 6 Is the formula R(a, b) open or closed?

.
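The syntactic notions of this section (formulas built inductively, subformulas, free occurrences, sentences and the substitution A(t/x)) lend themselves to a short recursive sketch. The tuple encoding below and the fixed set `VARIABLES` are assumptions made for the illustration, not the book's notation, and the substitution is only intended for the case where t is a constant, so capture of a substituted variable is not handled.

```python
# Assumed encoding of formulas as nested tuples:
#   ("atom", "P", ("x",))                       P(x)
#   ("not", B)                                  ¬B
#   ("and", B, C), ("or", B, C), ("imp", B, C)  (B∧C), (B∨C), (B→C)
#   ("all", "x", B), ("ex", "x", B)             ∀xB, ∃xB
VARIABLES = {"x", "y", "z"}   # assumed: every other term is a constant

def subformulas(f):
    """All nodes of f's ancestral tree, f itself included."""
    if f[0] == "atom":
        return [f]
    if f[0] == "not":
        return [f] + subformulas(f[1])
    if f[0] in ("and", "or", "imp"):
        return [f] + subformulas(f[1]) + subformulas(f[2])
    return [f] + subformulas(f[2])          # quantifier: single branch down to B

def free_vars(f):
    """Variables having at least one free occurrence in f."""
    if f[0] == "atom":
        return {t for t in f[2] if t in VARIABLES}
    if f[0] == "not":
        return free_vars(f[1])
    if f[0] in ("and", "or", "imp"):
        return free_vars(f[1]) | free_vars(f[2])
    return free_vars(f[2]) - {f[1]}         # occurrences in the scope are bound

def is_sentence(f):
    """A sentence is a closed formula: no free occurrence of any variable."""
    return not free_vars(f)

def subst(f, t, x):
    """A(t/x): put t for every free occurrence of x in f."""
    if f[0] == "atom":
        return ("atom", f[1], tuple(t if s == x else s for s in f[2]))
    if f[0] == "not":
        return ("not", subst(f[1], t, x))
    if f[0] in ("and", "or", "imp"):
        return (f[0], subst(f[1], t, x), subst(f[2], t, x))
    if f[1] == x:                            # x is bound from here down
        return f
    return (f[0], f[1], subst(f[2], t, x))

# P(x)→∃xR(x, y): the first x is free, the second is bound, y is free.
A = ("imp", ("atom", "P", ("x",)), ("ex", "x", ("atom", "R", ("x", "y"))))
all_p_q = ("all", "x", ("imp", ("atom", "P", ("x",)), ("atom", "Q", ("x",))))
print(free_vars(A), subst(A, "a", "x"), is_sentence(all_p_q))
```

Note that a formula with a vacuous quantifier, such as `("all", "x", ("atom", "P", ("a",)))`, counts as a sentence under this definition, in line with the remark above that the definition of F is deliberately permissive.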

2 TWO MORE TREE RULES

We shall now introduce tree rules for the two quantifiers, in each case by a pair of conjugate diagrams. We shall work backwards to them by supposing that we have generated a finished tree from a set Σ of first-order initial sentences, in which there is an open branch B. Recall that an open branch in a truth-functional truth tree determined a ‘world’ in which all the initial sentences were true; ‘world’ is in quotes because it was really just an assignment of the value T to each literal on the branch. We shall suppose that B also furnishes a ‘world’ in which all the sentences in Σ are true, but in this case one which is a bit more like a world, with a domain of individuals and predicates and relations defined in that domain. It will also be a ‘small world’ in the sense, roughly the same as that which economists give the term, that the only individuals in it will be those named by a constant appearing on B. This B-world is of course an interpretation of the first-order language whose predicate and relation symbols are those of the sentences in Σ. In the B-world a universally quantified sentence ∀xA(x) in that language is true if A(a) is true for every constant a on B (this is the ‘small world’ assumption), while ∀xA(x) is false (¬∀xA(x) is true) if for some constant c on B A(c) is false (¬A(c) is true). Thus we obtain a pair of unsigned conjugate diagrams for the universal quantifier:

The set-theoretic notation {A(a): a is a constant on B} is read ‘the set of all A(a) where a is a constant on B’. Similarly, an existentially quantified sentence ∃xA(x) is true in the B-world if A(c) is true for some constant c on B, while it is false (¬∃xA(x) is true) if A(a) is false (¬A(a) true) for every constant a on B; and so we have the following pair of conjugate diagrams for the existential quantifier:

Notice that, as with the diagrams for the connectives, we can also read these diagrams downwards, as saying that if the upper sentence in each is true, so are all the lower ones. This is important, because, as in the earlier truth-functional case, it will enable us to interpret a closed tree as signifying that the initial sentences cannot all be true together; in other words, the tree rules represented by these diagrams are sound. Call a sentence of the form ∀xA(x) or ¬∃xA(x) a γ sentence and one of the form ∃xA(x) or ¬∀xA(x) a δ sentence. In the table below we define corresponding sentences γ(a) and δ(a), where a is any constant of L:

γ           γ(a)        δ           δ(a)
∀xA(x)      A(a)        ∃xA(x)      A(a)
¬∃xA(x)     ¬A(a)       ¬∀xA(x)     ¬A(a)

γ(a) and δ(a) are called the instantiations of γ and δ with the constant a. We can now collapse the four unsigned quantifier diagrams into two:

Call {γ(a): a is a constant on B} the descendant of γ in (i) and δ(c) the descendant of δ in (ii). We can adopt diagram (ii) as a new tree rule, which we shall call the rule (δ), on the provisional hypothesis that the sentence to which (δ) is applied is a node on some eventually finished open branch which can be identified with B above. Of course, the hypothesis may be false, for all the branches passing through that node may close. But if it is not false we must somehow write into the statement of (δ) that the choice of c must be made in such a way that nothing else on B conflicts with c’s role of satisfying the condition A(x) or ¬A(x), as the case may be. The following condition turns out to be necessary and sufficient: the constant c in δ(c) must not be one which has already appeared on the branch above the point at which δ(c) is placed. The condition is necessary, because otherwise we could have this:


The proof that the condition is sufficient must wait until Chapter 8. Diagram (i) cannot, however, as it stands be used as a tree rule. There is nothing wrong with the descendant of γ being a set of sentences rather than a single sentence: after all, the descendant of an α sentence is a set, {α1, α2}. The problem with (i) is that there may not be a finished open branch in the tree (it may close) and even if there is we may not know, at the stage in its development at which we want to apply (i), which constants are on it. We can’t simply identify the set of constants on a branch, open or closed, with those in the initial sentences, for we now know that new constants not yet appearing may subsequently have to be added by an application of (δ). Fortunately, we can modify (i) to get round these difficulties quite easily, while still remaining in the spirit of the enterprise. We simply allow the instantiations γ(a) of γ to be introduced piecemeal on any branch as these new constants get added (if they do) to it. When and only when γ has been instantiated with every constant on the branch (including one introduced specifically for that purpose if there would otherwise have been none) shall we say that γ is used on that branch. This is by contrast with the other tree rules, where the sentences to which they are applied are used on a branch as soon as their descendants are placed on it. In the light of all these considerations, we can formulate the tree rules (γ) and (δ) as follows (N.B.: B is now the as yet unfinished branch on which the descendant in each case is placed):

As earlier, a tree will be said to be finished when either it closes or every usable sentence on every open branch is used on that branch. This is still not quite the final form of these rules, but it is final enough for our purposes now. Two features of the rules should be noted. First, though the constraints on them are inspired by semantic considerations, the rules themselves are purely formal (syntactical). They can be implemented without reference to any interpretation of the language; all one has to know at the point of applying them are the sentences so far generated on the branch. Second, their validity is not restricted in any way by the ‘small world’ assumption made at the outset; it will turn out that if a set of first-order sentences is true in any world then it is true in a ‘small’ one and conversely (this is the content of a celebrated result called the Löwenheim-Skolem Theorem, which we shall prove in Chapter 8). To get a feel for how the rules work, we shall construct a closed tree from the initial sentences

:

Two features of this tree deserve comment. (i) There is a clear strategic advantage to applying (δ) before (γ). This is not only because once the (δ) rule has been applied to a sentence that sentence is used once and for all, but also because giving the (δ) rule priority minimises the number of individual constants which have to be introduced. (ii) We have extended the (α) and (β) rules in a natural way to first-order sentences: (β) was applied to the sentence P(a)→Q(a), β1 being ¬P(a) and β2 Q(a). From now on the (α) and (β) rules are applied to any formulas which have the appropriate truth-functional form: for (α), of conjunctions, negations of disjunctions and negations of conditionals; and for (β), of disjunctions, conditionals and negations of conjunctions. It might seem from this example that first-order truth trees are just about as well behaved as trees for sentences in the standard propositional languages. Well behaved they are, as we shall see in due course, but they are not quite as well behaved as the purely truth-functional trees. In the first place, to ensure that trees which can close do close we shall need to impose conditions on the order of application of the tree rules. Second, we now have to contemplate infinite trees, as the following example shows. Suppose we try generating a truth tree from the single initial sentence . This is what happens:


The ‘etc.’ signifies that the tree goes on for ever! Clearly, every time one of the new constants introduced by the application of the (δ) rule instantiates the initial γ sentence, it gives rise to yet another δ sentence, which then introduces another constant and so on ad infinitum. The possibility that infinite trees may be generated from finite sets of initial sentences might seem to introduce an uncontrollable dimension into the theory of first-order truth trees, but actually this is not so. For the single-branched tree above, though infinite, is none the less well behaved enough. It is unambiguously a finished open tree: the initial γ sentence on it is definitely used, according to the criterion laid down earlier, since for every constant on the branch, the instantiation with that constant of the γ sentence appears on the branch.

Exercises

1 For each of the following, state whether it is a γ or δ sentence, or neither of these. (a) […] (b) […] (c) Q(b)→R(a, b) (d) […] (e) […] (f) […] (g) ¬P(a)

2 For each of the γ and δ sentences in question 1, what are γ(a), δ(a)?

3 In two lines of the apparently closed tree below, generated from the initial sentences […] and […], a rule has been misapplied. Identify the line and explain how the rule is misapplied.

4 Identify a domain and a binary relation defined in it, such that in that interpretation both […] and […] are true.

3 TREE PROOFS

Extending the terminology of Chapter 4, we shall say that there is a tree proof of a conclusion C from a set Γ of premises if the set of initial sentences consisting of Γ together with ¬C generates a closed tree, using any of the rules (α)–(δ) and Double Negation. The symbolism Γ ⊢ C will mean that there is such a tree proof of C from Γ. If there is such a proof, it follows from the Soundness Theorem, which we shall prove later, that there is no interpretation of the first-order language in which premises and conclusion are formalised in which those premises are true and the conclusion false; i.e. C is a valid inference from Γ. We shall end this chapter with tree proofs for some of the inferences we discussed earlier, starting with one for the syllogism (S) as formalised in Chapter 5, section 2:

That was easy. So is the tree proof for the other syllogism of Chapter 5 (sections 1, 4): All Ps are Qs. Some Ps are Ts. ∴ Some Qs are Ts. which will be left as an exercise. The next tree proof we give is for (*), Chapter 5, sections 1, 3:

We shall end by stating two useful facts about binary relations and proving one of them. Call a binary relation, symbolised by R, reflexive on a domain D if the sentence ∀xR(x, x) is true in D. R is irreflexive on D if ∀x¬R(x, x) is true in D. R is symmetric on D if ∀x∀y(R(x, y)→R(y, x)) is true in D; R is asymmetric on D if ∀x∀y(R(x, y)→¬R(y, x)) is true in D. R is transitive on D if ∀x∀y∀z((R(x, y)∧R(y, z))→R(x, z)) is true in D. R is intransitive on D if ∀x∀y∀z((R(x, y)∧R(y, z))→¬R(x, z)) is true in D. The two useful facts are that (i) if R is asymmetric on D it is irreflexive on D and (ii) if R is transitive and irreflexive on D it is asymmetric on D. We shall prove (i) by giving a tree proof of irreflexivity from asymmetry:
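As a sanity check on facts (i) and (ii), separate from any tree proof, one can search a small finite domain for counterexamples. The following is a minimal sketch in Python; the function names are illustrative and the property tests assume the standard first-order definitions of irreflexivity, asymmetry and transitivity. Finding no counterexample on a three-element domain proves nothing about arbitrary domains; only a tree proof (with soundness) does that.

```python
from itertools import product

def all_relations(domain):
    """Yield every binary relation on `domain` as a set of ordered pairs."""
    pairs = list(product(domain, repeat=2))
    for mask in product([False, True], repeat=len(pairs)):
        yield {p for p, keep in zip(pairs, mask) if keep}

def irreflexive(rel, domain):
    # No element is related to itself.
    return all((x, x) not in rel for x in domain)

def asymmetric(rel, domain):
    # Whenever (x, y) is in the relation, (y, x) is not.
    return all((y, x) not in rel for (x, y) in rel)

def transitive(rel, domain):
    # Whenever (x, y) and (y, z) are in the relation, so is (x, z).
    return all((x, z) in rel
               for (x, y) in rel
               for (y2, z) in rel if y == y2)

def counterexamples(domain):
    """Relations on `domain` violating fact (i) or fact (ii)."""
    bad_i, bad_ii = [], []
    for rel in all_relations(domain):
        if asymmetric(rel, domain) and not irreflexive(rel, domain):
            bad_i.append(rel)
        if (transitive(rel, domain) and irreflexive(rel, domain)
                and not asymmetric(rel, domain)):
            bad_ii.append(rel)
    return bad_i, bad_ii

# 3 elements give 9 ordered pairs, hence 2**9 = 512 relations to inspect.
bad_i, bad_ii = counterexamples([0, 1, 2])
print(len(bad_i), len(bad_ii))  # 0 0
```

Each relation is represented by its extension, a set of ordered pairs, so the three tests are direct transcriptions of the quantified definitions.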

(ii) is left as exercise 7 below.

Exercises

1 Show that (i) […] (ii) […] (iii) […] (iv) […] (v) […] (vi) […]

2 Show that […] and conversely (i.e. with premise and conclusion interchanged) and that […] and conversely.

3 Show that […] (cf. Exercise 1, Chapter 5, section 4).

4 A set A is a subset of a set B (standardly symbolised A⊆B) if every member of A is a member of B. Consider a domain D consisting of arbitrary things and sets. Let R(x, y) be true in D just when y is a set and x is an element of y (in mathematics textbooks this relation is written x∈y). So we can formalise the statement ‘y is a subset of z’, i.e. ‘for all x, if x is an element of y then x is an element of z’, as ∀x(R(x, y)→R(x, z)). It can be proved from the axioms of set theory that there is a unique set which has no members and this is called the empty set (we saw earlier that it is given the conventional symbol Ø); i.e. where a denotes Ø, ∀x¬R(x, a) is true. Show by a tree proof that […], i.e. the empty set is a subset of every set.

5 Give a tree proof of the inference (**) in Chapter 5, section 3.

6 Show that (i) […] (ii) […] (iii) […] (iv) […] (v) […]. Show that (iv) and (v) remain true when ∀ is replaced by ∃.

7 Give a tree proof which establishes that every transitive irreflexive binary relation is asymmetric.

8 Let Σ be some set, possibly empty, of sentences. Show that
(i) If there is a tree proof of a sentence A from a sentence B together with Σ then there is a tree proof of B→A from Σ.
(ii) If there is a tree proof of a sentence C from a sentence A together with Σ and of C from a sentence B together with Σ, there is a tree proof of C from A∨B together with Σ.
(iii) If there is a tree proof of a sentence B from A(a) together with Σ, where a does not occur in the tree, then there is a tree proof of B from ∃xA(x) together with Σ.

Chapter 7 First-order languages: semantics

1 INTERPRETATIONS

Chapter 5 referred to first-order languages interpreted in some domain. To generate the results of the next chapter we need to be a bit more precise about just what sorts of things interpretations of L are. To prepare the following discussion, suppose that L contains a binary relation symbol R and consider the sentence (i.e. closed formula) […] of L. As it stands it has no truth-value, because no domain has been specified. But it automatically acquires a truth-value once we specify a domain and a binary relation defined in the domain as the interpretation of R. Now suppose L also contains a constant a. For the sentence ∃xR(x, a) to have a truth-value in that domain a must be made to refer to some individual in the domain. We can generalise these observations as follows: specifying an interpretation of a first-order language will mean specifying a domain and interpretations in that domain of the extralogical vocabulary of that language. We shall use the capital Gothic […] to refer to a generic interpretation of L. The domain of […] will be written […] and the binary relation defined in […] interpreting R will be written […]. An interpretation of L which we shall sometimes use for illustrative purposes is N, whose domain is N, the set {0, 1, 2, 3,…} of natural numbers and in which Rᴺ is >; i.e. R(x, y) is interpreted in N as the relation x>y. It is not difficult to see that so interpreted the L-sentence […] is true, and […] is false: the former sentence says (in N) that for every natural number there is a greater, which is true, while the second says that some natural number is greater than every natural number, including itself, which is false. However, in Chapter 6 it was shown that the sentences […] and […] generate a closed tree. From this and the Soundness Theorem proved in Chapter 8 we can conclude that all attempts to find ways of making those two initial sentences jointly true fail: there is no interpretation of L in which those two sentences are true. In particular, if […], there is no binary relation […] of natural numbers such that both those sentences are true in […].

We should pause at this point to think through the implications of this remark. What exactly does it mean to say ‘there is no binary relation of natural numbers such that …’? What is included in this class? There are many familiar binary relations of natural numbers, for example <, ≥ and identity =. With a little thought we can come up with some less familiar ones. Does ‘all binary relations of natural numbers’ mean merely those for which we currently have names? Surely not. Knowledge develops and our conceptual portfolio develops hand in hand


with it. It would be short-sighted, to say the least, to restrict a theory of deductive consequence to the items in that portfolio at any given time. On the other hand, the notion of an arbitrary binary or for that matter n-place relation on an arbitrary domain sounds so nebulous that to translate it into something concrete and acceptably objective would seem on the face of it a hopeless enterprise. Surprisingly, this turns out to be not at all the case. Indeed, the solution to the problem has been known for almost a century. The germ of the solution lies in a distinction first drawn by logicians centuries ago, between intensions and extensions. The intension of a property, for example being a person, is, roughly speaking, the meaning of the phrase ‘is a person’. The extension is the set of all things, in this case the set of people, that have the property. Similarly, the intension of, say, a binary relation is the meaning of a standard description of it. Its extension requires a little more consideration. Recall from an earlier discussion (Chapter 5, section 3) that implicit in the notation R(x, y) is that x and y in that order stand in the relation represented by R. It therefore seems natural to say that the things of which R(x, y) is true in any domain are ordered pairs of domain elements. We have reached an important point in the discussion, for ordered sets will play a central role in our theory of interpretations of first-order languages and to explain clearly what they are we need to make a brief digression into elementary set theory. In set theory the word ‘set’ unqualified means ‘unordered set’ and it is customary to signify these by using curly brackets {} to enclose the terms denoting their members; we have already used these set brackets in earlier chapters. Thus the unordered pair consisting of Cain and Abel, say, is written {Cain, Abel} and because it is an unordered set, {Cain, Abel}= {Abel, Cain}. 
But the ordered pair of Cain and Abel, in that order, is written with curved brackets enclosing ‘Cain’ first and ‘Abel’ second, thus: (Cain, Abel). For ordered sets (u, v) and (v, u), (u, v)=(v, u) if and only if u=v. But Cain ≠ Abel and so (Cain, Abel) ≠ (Abel, Cain). Indeed, the first pair is in the extension of many binary relations (for example, is or was a slayer of) of which the other is not. Another relevant feature which distinguishes ordered pairs from unordered ones is that the set {Cain, Cain} is not a pair at all: the set-theoretical Axiom of Extensionality says that an unordered set is uniquely determined by its members, from which it follows that {Cain, Cain} = {Cain}. By contrast, the ordered set (Cain, Cain) is a genuine pair and indeed there is a familiar binary relation defined in the domain D={Cain, Abel} of whose extension both (Cain, Cain) and (Abel, Abel) are members, that of identity = (nor is this the only one: being the same height as is another). The set of all ordered pairs of members of D thus has four members: (Cain, Cain), (Cain, Abel), (Abel, Cain), (Abel, Abel). Just as there is a set of all ordered pairs of members of D, so there is a set of all ordered triples, quadruples,… and in general n-tuples of members of D, for any positive n (the set of all 1-tuples we can regard simply as D itself). The set of all ordered triples of members of D has 8 members (Cain, Cain, Cain), (Cain, Cain, Abel),…, (Abel, Abel, Abel), of quadruples of members of D has 16 members and of n-tuples of members of D has 2ⁿ members. The set of all ordered n-tuples of D is written in set-theoretic notation as Dⁿ; it is also called the nth Cartesian product of D with itself. That ends the set theory. We can now identify the extension of an n-place relation in a domain with the corresponding set of ordered n-tuples of domain elements which stand in

that relation (formally, the notation R(x1,…, xn) suggests that R is a predicate of n-tuples). Being sets, extensions seem to be admirably objective in character and also (an important bonus) well understood mathematically. Intensions, by contrast, seem to be just the sort of knowledge-dependent entities that we decided we did not want to base our logical theory on. The appropriate strategy in these circumstances is to apply what the philosophers call Occam’s Razor (allegedly introduced into philosophical debate by the Schoolman William of Occam, Occam’s Razor is the injunction not to multiply entities unnecessarily) and eliminate intensions entirely from consideration, identifying properties and relations straightforwardly with their extensions. The next step is to regard any set of n-tuples of members of some domain as the extension of some relation defined in it. Thereby the apparently nebulous notions of an arbitrary property and of an arbitrary n-place relation defined in a domain D are replaced by a well-understood and objective mathematical concept, an arbitrary set of the appropriate dimension (the dimension of a set of n-tuples is n; a subset of D itself is defined to have dimension 1).

We are now in a position to define the notion of an interpretation of an arbitrary first-order language with complete generality; the definition, due originally to the Polish-American logician and mathematician Alfred Tarski, is as follows:

An interpretation of a first-order language L is a rule specifying

(i) a non-empty set […] of individuals, called the domain of the variables of L. We should recall from our discussion in Chapter 5 that the individuals in […] are not necessarily physical objects. They can be anything which can be conceptually individuated, concrete or abstract, like processes, actions, numbers, algebraic structures, thoughts, emotions, or what have you.

(ii) for each individual constant c of L, a particular member […] of […]. In other words, […] is the individual in […] named by the constant c of L. N.B.: there is no rule preventing the same individual in […] being the interpretation of more than one constant, i.e. there is nothing to stop […] being the same individual in […] as […] (this should not worry anybody familiar with the custom of many people giving their offspring more than one name).

(iii) for each predicate symbol P of L, a set […] of individuals in […]; this subset of […] specifies which individuals in […] are to have the property P in […]. Thus the sentence P(c), where c is a constant, will be true in […] just in case […] is in […].

(iv) for each n-place relation symbol R of L (n>1), a set […] of ordered n-tuples of individuals in […]. […] specifies those n-tuples of individuals which determine the R-relation in […]. Thus, if c1,…, cn are constants and R an n-place relation symbol, R(c1,…, cn) is true in […] just in case the n-tuple ([…]) is in […].
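The four clauses of the definition can be made concrete for a finite domain. The sketch below is illustrative only: the dictionary layout, the symbol names (c, d, e, P, R, S) and the helper functions are assumptions introduced here, not the book's notation. It represents predicates and relations by their extensions and evaluates atomic sentences exactly as clauses (iii) and (iv) prescribe.

```python
from itertools import product

def n_tuples(domain, n):
    """All ordered n-tuples of members of `domain` (the n-th Cartesian power)."""
    return set(product(domain, repeat=n))

# A toy interpretation over the two-element domain used in the text.
interp = {
    "domain": {"Cain", "Abel"},                            # clause (i)
    "constants": {"c": "Cain", "d": "Cain", "e": "Abel"},  # clause (ii): c and d name the same individual
    "predicates": {"P": {"Abel"}},                         # clause (iii): a subset of the domain
    "relations": {"R": {("Cain", "Abel")}, "S": set()},    # clause (iv): sets of ordered pairs; S is empty
}

def is_interpretation(i):
    """Check clauses (i)-(iv) for a finite interpretation."""
    dom = i["domain"]
    return (
        len(dom) > 0
        and all(v in dom for v in i["constants"].values())
        and all(ext <= dom for ext in i["predicates"].values())
        and all(all(x in dom for x in t)
                for ext in i["relations"].values() for t in ext)
    )

def holds(i, symbol, *names):
    """Truth-value in i of the atomic sentence symbol(names...)."""
    individuals = tuple(i["constants"][n] for n in names)
    if symbol in i["predicates"]:
        return individuals[0] in i["predicates"][symbol]
    return individuals in i["relations"][symbol]

print(is_interpretation(interp))     # True
print(holds(interp, "R", "c", "e"))  # True:  (Cain, Abel) is in the extension of R
print(holds(interp, "R", "e", "c"))  # False: (Abel, Cain) is not
print(holds(interp, "S", "c", "e"))  # False: the empty relation relates nothing
print({"Cain", "Abel"} == {"Abel", "Cain"})    # True:  unordered pair
print(("Cain", "Abel") == ("Abel", "Cain"))    # False: ordered pair
print([len(n_tuples(interp["domain"], n)) for n in (1, 2, 3, 4)])  # [2, 4, 8, 16]
```

Python sets and tuples mirror the distinction between unordered and ordered sets drawn above, and the last line reproduces the count of 2ⁿ n-tuples over a two-element domain. Here 1-tuples are kept as tuples of length one rather than identified with the domain elements themselves.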

A consequence of the definition of an n-place relation as a set of ordered n-tuples of […] is that the empty set Ø automatically qualifies as an n-place relation. For, as we know from exercise 4, section 3, Chapter 6, Ø is a subset of any set and so qualifies both as a subset of […] and as a subset of […], the set of all n-tuples of […]. To assign the empty set as the interpretation […] of a predicate symbol P means that no individuals in […] will be Ps in […]; if […] is Ø, where R is any relation symbol of L, this means that no individuals in […] are R-related in […].

Exercises

1 Show that a tree generated from the set […] closes. What does this tell you about the identity of […] in any interpretation […] which makes the sentence […]→¬Q(x)) true?

2 Show that a tree generated from the set […] closes. What does this tell you about the identity of […] if […] is true in […] and […] is Ø?

3 Show that a tree generated from the set […] closes. What does this tell you about the identity of […] if […] is true in […] and […] is the entire domain of […]?

4 Let D be the set {0, 1}. List all the members of D³, i.e. all the triples or 3-tuples of members of D.

2 FORMULAS AND TRUTH

In elementary mathematics we are familiar with open formulas being true for given values of their free variables and false for others. For example, ‘x […]

[…] can even be defined in terms of the □ operator thus: A>B = □(A→B). An alternative way of developing modal propositional logic is to take such an operator > as primitive and define □, and hence ◊, in terms of it as follows: □A = T>A, where T is a tautology. Lewis proposed several inequivalent modal deductive systems, characterised, like H in Chapter 10, by sets of logical axioms and rules of inference and classified as S1–S5, but provided little if any independent semantic underpinning for them. The situation was changed dramatically in the fifties and sixties by Kripke, who provided a uniform semantic framework in terms of which it is possible to prove the completeness of various modal systems with respect to different interpretations of the modal operators within that framework. The heuristic motivation for this so-called Kripke semantics was Leibniz’s view that this world is merely one among a host of other possible worlds and that a statement is necessary if it is true in all possible worlds. Kripke’s innovation was to consider possible worlds as purely formal objects in the domain of a binary relation R called an accessibility relation, and he showed that by imposing stronger or weaker constraints on R, different classes of interpretation are obtained with respect to which familiar modal deductive systems can be shown to be sound and complete.


We shall limit the discussion to modal propositional logic, since most of the interesting recent work done in modal logic is located there. Where R is an accessibility relation on a set W of worlds and L the modal propositional language generated from some set of sentence letters, the pair (W, R) is called a frame for L. An interpretation v in a frame F assigns a truth-value, for each world w in W, to every sentence X in L. First v assigns a truth-value to every sentence letter; write this as v(A, w)=T or F as the case may be. The truth-functional connectives are evaluated in the usual way: v(¬X, w)=T if and only if v(X, w)=F; v(X∨Y, w)=T if and only if v(X, w)=T or v(Y, w)=T, etc. For X=□Y, v(X, w)=T if and only if v(Y, w')=T for all w' such that R(w, w'); informally, X is true in w if and only if Y is true in every world w' accessible from w. An interpretation v in F is said to be a model in F of X if v(X, w)=T for all w in W. X is said to be valid in F if all interpretations in F are models in F of X. The sentences valid in all frames in which R is reflexive constitute the modal system T. The sentences valid in all interpretations in which R is transitive and reflexive are those of Lewis’s S4. The sentences valid in all interpretations in which R is reflexive, transitive and symmetric, i.e. an equivalence relation, are those of S5. Each of these systems has an equivalent syntactical characterisation: as H-style formalisations, these various sets of sentences are those derivable from suitably chosen logical axioms by means of two rules of inference, modus ponens and Necessitation: if A is derivable, so is □A. T’s logical axioms are all instances of □(A→B)→(□A→□B) and □A→A. S4’s are those of T plus □A→□□A. S5’s are those of S4 plus ◊A→□◊A.
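The evaluation clauses above are easy to run on small finite frames. The following is a minimal sketch, assuming a representation of sentences as nested tuples and of an interpretation v as a dictionary from (letter, world) pairs to truth-values; these representations and all names are illustrative choices made here. Validity in a frame is checked by enumerating every interpretation of the sentence letters involved, which is feasible only for tiny frames.

```python
from itertools import product

def letters(f):
    """Sentence letters occurring in formula f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(letters(g) for g in f[1:]))

def true_at(f, w, R, v):
    """v(f, w) = T, following the clauses for the connectives, box and diamond."""
    op = f[0]
    if op == "atom":
        return v[(f[1], w)]
    if op == "not":
        return not true_at(f[1], w, R, v)
    if op == "or":
        return true_at(f[1], w, R, v) or true_at(f[2], w, R, v)
    if op == "imp":
        return (not true_at(f[1], w, R, v)) or true_at(f[2], w, R, v)
    if op == "box":  # true in every world accessible from w
        return all(true_at(f[1], u, R, v) for (x, u) in R if x == w)
    if op == "dia":  # true in some world accessible from w
        return any(true_at(f[1], u, R, v) for (x, u) in R if x == w)
    raise ValueError(op)

def valid_in_frame(f, W, R):
    """True if every interpretation in the frame (W, R) is a model of f."""
    keys = [(a, w) for a in sorted(letters(f)) for w in W]
    for bits in product([True, False], repeat=len(keys)):
        v = dict(zip(keys, bits))
        if not all(true_at(f, w, R, v) for w in W):
            return False
    return True

p = ("atom", "p")
T_AXIOM = ("imp", ("box", p), p)                      # box p -> p
FOUR = ("imp", ("box", p), ("box", ("box", p)))       # box p -> box box p
FIVE = ("imp", ("dia", p), ("box", ("dia", p)))       # dia p -> box dia p

W = [0, 1]
no_loops = {(0, 1)}                                   # not reflexive
reflexive_transitive = {(0, 0), (1, 1), (0, 1)}       # not symmetric
equivalence = {(0, 0), (0, 1), (1, 0), (1, 1)}

print(valid_in_frame(T_AXIOM, W, no_loops))              # False
print(valid_in_frame(T_AXIOM, W, reflexive_transitive))  # True
print(valid_in_frame(FOUR, W, reflexive_transitive))     # True
print(valid_in_frame(FIVE, W, reflexive_transitive))     # False
print(valid_in_frame(FIVE, W, equivalence))              # True
```

Checking a single frame does not establish validity in all frames of a class; the examples only illustrate how each characteristic axiom can fail once the matching constraint on R is dropped.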
For further information about traditional modal logic the reader should consult a good introductory text, like that of Chellas 1980, or Hughes and Cresswell 1972 (who also discuss so-called quantified modal logic, the extension of the modal operators to predicate languages). Anyone wishing to keep logic metaphysics-free might cast a wary eye on Kripke’s semantics for modal logic. One response is that sets of possible worlds with weaker and stronger accessibility relations on them are probably best looked on as algebraical structures which furnish a useful mathematical tool for investigating the relation between different modal systems. If this were the end of the story modal logic might by now have ceased to be of much interest to logicians. In fact, interest in it has never been keener, inspired by the fact that various interesting formal deductive systems can be interpreted in an illuminating way in a suitable modal system and vice versa (an interpretation of one formal system in another is a translation-function f which maps sentences of the language of the first system into the sentences of that of the second, in such a way that the theorems of the first system are translated into sentences of the second). For example, Intuitionistic logic is interpretable in S4, the translation exploiting the fact that Kripke models for Intuitionistic logic are closely related to S4 frames. Now Intuitionistic logic is allegedly a logic of constructive provability, which suggests that one way of regarding necessity is as provability from principles themselves regarded as a priori necessary. 
A fruitful way of exploring this idea is suggested by the following facts (in what follows, ‘A’ will be the numeral in PA for the Gödel number of A): (i) the provability predicate for the first-order Peano Axioms is definable in the language L_PA of first-order Peano Arithmetic by a formula Pr(x) (Chapter 11, section 5); (ii) first-order Peano Arithmetic seems to be a prime candidate for the status of an a priori necessary body of knowledge; (iii) if A is provable (from PA) so is Pr(‘A’), mimicking the modal rule of Necessitation; and (iv) if Pr(‘A→B’) and Pr(‘A’) are provable so is Pr(‘B’), mimicking the fact that if □(A→B) and □A are modal theorems so is □B. (i)–(iv) suggest that a substantial part of basic modal logic is interpretable in PA, with the formula Pr(x) in L_PA interpreting the modal box. Some of the very deep results about the structure of provability obtained from the study of this interpretation are collected and clearly explained in Boolos (1993; this work also contains an excellent and very lucid account of the Gödel Incompleteness Theorems).

But there are theorems of all the standard modal systems that do not translate into theorems of PA, for there are sentences A such that Pr(‘A’)→A is not a theorem of PA. It is not difficult to show why this is so (the result is originally due to Montague (1963)). First, some background. A famous earlier result of Gödel (known as Gödel’s Diagonal Lemma) is that for any formula F(x) of L_PA there is a sentence B such that F(‘B’)↔B is provable from PA (we take ↔ to be defined in the usual way). In particular, ¬Pr(‘G’)↔G is a consequence of PA where G is the ‘undecidable’ sentence of Chapter 11, section 5. A theorem of Löb (Boolos 1993:56) says that if Pr(‘A’)→A is a consequence of PA, for any sentence A, then so is A. Löb’s Theorem and Gödel’s First Incompleteness Theorem jointly imply that if PA is consistent then Pr(‘G’)→G is not provable from PA. Suppose, however, we add that sentence and indeed all instances of Pr(‘A’)→A to PA, since for every sentence letter A, □A→A is a valid sentence in every traditional modal system and rightly so if necessity means anything familiar at all. Call the result of making all these additions the system M. M is now easily shown to be inconsistent. For by truth-functional logic ¬Pr(‘G’) is a consequence of Pr(‘G’)→G and ¬Pr(‘G’)↔G. So is G. Hence G is provable in M and so by (iii) above Pr(‘G’) is provable in M. Hence M is inconsistent.
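The purely truth-functional step in this argument can be checked with a four-row truth table. In the sketch below the Boolean variable p stands in for Pr(‘G’) and g for G; this treats both as unanalysed sentence letters, which is all the step requires, and the function and variable names are illustrative.

```python
from itertools import product

def entails(premises, conclusion):
    """Truth-functional consequence over the two letters p and g."""
    return all(conclusion(p, g)
               for p, g in product([True, False], repeat=2)
               if all(prem(p, g) for prem in premises))

reflection = lambda p, g: (not p) or g   # Pr('G') -> G
diagonal = lambda p, g: (not p) == g     # not-Pr('G') <-> G
provable = lambda p, g: p                # Pr('G'), obtained from G by (iii)

print(entails([reflection, diagonal], lambda p, g: not p))  # True: not-Pr('G') follows
print(entails([reflection, diagonal], lambda p, g: g))      # True: G follows

# Adding Pr('G') leaves no row of the truth table satisfying all three sentences.
rows = [(p, g) for p, g in product([True, False], repeat=2)
        if reflection(p, g) and diagonal(p, g) and provable(p, g)]
print(rows)  # []
```

The only assignment satisfying the first two sentences makes p false and g true, so once Pr(‘G’) is also derivable the three sentences are jointly unsatisfiable.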
This startling result shows that for any language L for which necessity is a predicate of sentences of L, it cannot be one in which (a) is defined by a formula of L, (b) renders all the consequences of first-order Peano Arithmetic necessary truths and (c) satisfies all the most basic modal principles. It is tempting to regard this result as showing that assigning necessity the role of object-language operator, as traditional modal logic does, simply obscures the fact that it is essentially metalinguistic, like truth according to the Metalinguistic Theory. But this conclusion would be premature. Montague’s result does not demonstrate that there is not some sense or senses of necessity which can be expressed in a suitable object-language. Kripke’s semantics shows that there is. There are also more intuitive senses of necessity, like the sort of physical necessity which laws of nature are alleged to possess, for example, in which it is not true that the laws of arithmetic, even if true, are necessary and there have been explicitly modal accounts of physical necessity. There have even been modal interpretations of counterfactuals (Lowe 1983).

3 INDICATIVE CONDITIONALS AND →

It has been conceded that the truth-functional → does not in general provide a good interpretation of counterfactual conditionals (at the same time, however, it seems at least open to doubt whether any definite truth-claims are made by counterfactuals, except when they can be interpreted as the conditional predictions made by scientific laws). A

more radical objection to the truth-functional → is that it does not adequately represent even the non-subjunctive, or indicative, conditionals used in ordinary speech. Those who believe it does not, employ a battery of informal examples which, they claim, are counterexamples to any formalisation using →. The following are a representative sample from the literature. I hope that by the end of their discussion it will be apparent that they are not counterexamples to →.

(i) Suppose A is the sentence ‘I add sugar to my coffee’, B is ‘It [my coffee] will taste sweet’ and C is ‘I add diesel to my coffee.’ Here we seem to have a counterexample to the inference ‘If A then B. Therefore, if (A and C) then B’, which is deductively valid if the conditional is represented by the truth-functional →. Hence, we are asked to conclude, → does not represent the ordinary English conditional.

Answer It is quite extraordinary that this could ever have been regarded as a serious objection, yet it certainly has been. At any rate, it is no counterexample, merely a failure to be explicit. ‘If I add sugar to my coffee it will taste sweet’ is accepted as true only because ‘and I add nothing else’ is tacitly added to the antecedent. What is really being asserted is a statement of the form ‘If A and D then B.’ The inference is therefore one of the form ‘If A and D then B. Therefore if A and not-D then B’, which is truth-functionally invalid (‘I add diesel’ we can represent, at least for the purposes of the discussion, as being the negation of D).

(ii) ‘If it rains then it won’t rain heavily. Therefore if it rains heavily then it won’t rain.’ The premise may be true, but the conclusion seems rather obviously false, contradicting the claim that all inferences of the form ‘If A then B. Therefore if not-B then not-A’ are deductively valid, as they are, of course, when formalised using the truth-functional arrow →.

Answer This is slightly, but only slightly, an advance on the previous example.
‘If it doesn’t rain then it won’t rain heavily’ is something we will presumably accept as a necessary truth (the discussion of adverbial constructions, in Chapter 5, shows that when formalised in first-order logic it is a logical truth). From this and the original premise, ‘If it rains then it won’t rain heavily’, we can infer unconditionally, by a step that seems valid enough (the Rule of Dilemma: ‘If A implies B and the negation of A also implies B, then B’), that it won’t rain heavily. From ‘It won’t rain heavily’ we infer ‘If it rains heavily then it won’t rain’ by Absurdity (Chapter 10, section 3) and →-introduction (Chapter 10, section 3). There are people who will reject the answer to (ii) because they reject the Rule of Absurdity. Yet Absurdity seems completely justified by the provisional definition of deductive validity in the first chapter: clearly, if a premise cannot be true then a fortiori it cannot be true and any conclusion false. However, the inference ‘Not-A, therefore if A then B’ which it supports has been questioned for the following reason: (iii) Suppose A is ‘A Democrat will be the next US President’, and B is ‘The next US President will permit racial segregation’ (this is similar to an example in Edgington 1991:180). The problem here is that it seems that we can rationally accept ‘Not-A’ as true (depending on the state of the opinion polls) and no less rationally reject ‘If A then B.’ However, a basic principle of probability theory asserts that the probability of the conclusion of a deductively valid inference is at least as great as that of the conjunction of

the premises (Howson and Urbach 1993:25). As we know, A→B is a truth-functional consequence of ¬A, so if we model the English conditional by →, then, if we accept ‘Not-A’ as more probable than not, we are bound to accept ‘If A then B’ as more probable than not. But, in this example, ‘If A then B’ seems almost certainly false, quite independently of the probability of ‘Not-A’, which in the appropriate circumstances might be regarded as probably true. Surely these judgments are not really inconsistent? But if they are not, then it follows that we cannot model the English conditional by →.

Answer We shall argue that while ‘If a Democrat will be the next US President then they will permit racial segregation’ seems obviously false, the grounds for judging that it is false do not on analysis support that conclusion. Further discussion will be postponed until after consideration of the next, related, example.

(iv) Here is a well-known but too-easy proof of God’s existence. The sole premise is ‘It isn’t true that if God exists then we are free to do as we like’, which seems to be true. But ‘God exists’ is a truth-functional consequence of that premise if the latter is formalised as a negated truth-functional conditional ¬(A→B). Yet nobody in their right mind would believe this inference to God’s existence to be really valid.

Answer to (iii) and (iv) The crucial issue in both these last two examples is whether the relevant factual data are properly expressed by asserting the falsity of an English conditional (or equivalently the truth of its negation): in the first example, of the conditional ‘If a Democrat will be the next US President then they will permit racial segregation’ and in the second, ‘If God exists then we are free to do as we like.’ Consider the second first.
We presumably feel that our available evidence, as presented in sacred literature and/or the theologians’ interpretation of it, indicates that God’s existing would be incompatible with our freedom to do as we please; i.e. in conjunction with that evidence, ‘God exists’ implies the falsity of ‘We are free to do as we please.’ But then what we have grounds for believing true is not the negated conditional ‘It is not the case that if God exists we are free to do as we please’, but the very different conditional ‘If God exists we are not free to do as we please’; only from the latter is there a clear implication that we are not free to do as we please if God really does exist. Similarly, in (iii), what our evidence directly supports is not the negated conditional ‘It is not the case that if a Democrat will be the next US President then the next US President will reintroduce racial segregation’, but the conditional ‘If a Democrat is the next US President then the next US President will not permit racial segregation.’ In other words, what the evidence directly supports in each case is a conditional of the form ‘If A then not-B.’ Nor do there seem to be any further grounds for asserting ‘It is not the case that if A then B.’ ‘It is not the case that if A then B’ certainly does not follow deductively from ‘If A then not-B’ when ‘If ... then ...’ is rendered by the truth-functional arrow, nor is there any compelling reason to think it should on any wider consideration. One reading which does make that inference valid is a Lewis-Stalnaker one, in which the conditional is parsed in the same way as either of their counterfactuals. However, even were the Lewis-Stalnaker approach acceptable (and there is one counterfactual that resists their treatment), which it is not if the earlier discussion of it is sound, the conditionals in these examples are not counterfactuals, nor does there seem any good reason why they should be treated as such.
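The truth-functional claims made in examples (i)–(iv) and their answers can each be confirmed by a truth table. The sketch below is a brute-force check; the dictionary keys and helper names are labels invented here, and the third Boolean variable plays the role of C in (i) and of D in its answer.

```python
from itertools import product

def imp(a, b):
    """The truth-functional conditional a -> b."""
    return (not a) or b

def valid(premise, conclusion):
    """True if no assignment makes the premise true and the conclusion false."""
    return all(conclusion(a, b, c)
               for a, b, c in product([True, False], repeat=3)
               if premise(a, b, c))

results = {
    # (i): A->B, therefore (A and C)->B
    "strengthening": valid(lambda a, b, c: imp(a, b),
                           lambda a, b, c: imp(a and c, b)),
    # answer to (i): (A and D)->B, therefore (A and not-D)->B
    "hidden_proviso": valid(lambda a, b, d: imp(a and d, b),
                            lambda a, b, d: imp(a and not d, b)),
    # (ii): A->B, therefore not-B -> not-A
    "contraposition": valid(lambda a, b, c: imp(a, b),
                            lambda a, b, c: imp(not b, not a)),
    # (iii): not-A, therefore A->B
    "false_antecedent": valid(lambda a, b, c: not a,
                              lambda a, b, c: imp(a, b)),
    # (iv): not-(A->B), therefore A
    "negated_conditional": valid(lambda a, b, c: not imp(a, b),
                                 lambda a, b, c: a),
    # answer to (iii)/(iv): A -> not-B, therefore not-(A->B)
    "inner_to_outer_negation": valid(lambda a, b, c: imp(a, not b),
                                     lambda a, b, c: not imp(a, b)),
}

for name, ok in results.items():
    print(name, ok)
# strengthening True
# hidden_proviso False
# contraposition True
# false_antecedent True
# negated_conditional True
# inner_to_outer_negation False
```

The two invalid forms are exactly the ones the answers rely on: the inference with the tacit proviso made explicit, and the step from ‘If A then not-B’ to ‘It is not the case that if A then B.’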
It might be objected that in ordinary English we readily make the inference from ‘If A


then not-B’ to ‘It is not the case that if A then B.’ The objection to accepting this, even if it were true, which is doubtful, is not only that there appears to be no justification for such an inference, but that to adopt it as a rule would lead to incoherence, as the following example shows. Let A=B=¬(0=0). ‘If ¬(0=0) then 0=0’ seems to be something which, when thought through, is acceptable to anybody who accepts the ordinary theory of identity and the Absurdity Rule. If they also accept contraposition then they should accept ‘If ¬(0=0) then ¬¬(0 = 0)’, i.e. ‘If A then ¬B.’ But they would almost certainly not accept as true ‘It is not the case that if A then B’, since that is the same sentence as ‘It is not the case that if A then A.’ The lesson from this is, I believe, that what we accept and reject in the way of rules should not be based on consideration of an isolated example, but should instead be a decision constrained by more global considerations of how well it contributes to an overall consistent and acceptable theory. But this is to acknowledge the authority of methodological criteria and in particular two widely accepted as constraints on scientific theorising of any sort, generality and coherence. The earlier chapters have shown that many intuitively valid inferences involving conditionals can be successfully analysed using the purely truth-functional →. In the light of this and the previous discussion, it may well be (I actually believe it to be) that the truth-functional conditional is overall the best coherent model of inferences involving indicative conditionals and that where our intuitions conflict with its deliverances those intuitions may simply not offer the best, or even good, guidance. There is nothing bizarre about this suggestion: intuitions are frequently, if reluctantly, ignored in the face of a coherent theory which says that they are wrong. 
Intuitions about probability, for example, can be notoriously at odds with the theory of probability (a wealth of empirical studies shows just how divergent intuitions are from the theory), yet the broad consensus is that the latter is the sounder judge of what is correct. I suggest, tentatively, that the same is true in the present case and that the advantages which flow from the truth-functional account will eventually be seen to outweigh contrary intuitions.

It should be said that the foregoing discussion of the relation between ordinary-language conditionals and the truth-functional → is heavily coloured by the author’s own views (though these have a certain amount in common with the theory of assertibility conditions for conditionals due to Grice and Jackson (Jackson 1991)). There has been much debate of this topic and the reader is strongly encouraged to examine alternative accounts. Fortunately, there are two excellent anthologies, by Harper, Stalnaker and Pearce (1981) and by Jackson (1991), as well as a recent survey article by Edgington (1995). Sainsbury (1991) contains a thorough and clear discussion, as does Read (1994).

4 CONCLUSION

First-order logic has, I believe, a good claim to be regarded as the formal model for an invariant and substantial core of informal deductive reasoning. Its fit is not everywhere perfect, but considering the largely unregulated manner in which human language and reasoning have developed and the variety of purposes to which they are put, a perfect fit is hardly to be expected.

Almost every year, however, brings further claims that first-order logic is ‘dead’. One of the most recent is Devlin’s (1991), as a preamble to presenting his own theory of logic as a sort of very general theory of information-processing. One of the alleged deficiencies is the familiar problem of conditionals and I have argued that on inspection this does not issue in a general condemnation of the truth-functional →. Devlin’s second objection is more fundamental: it is to show that the whole idea of necessary truth-preservation from premises to conclusion is misconceived. His example is the inference

(*) Jon walked into the restaurant. He saw that the waitress had dirty hands. So Jon left immediately.

This, he claims, would be declared valid by any ordinary person untutored in classical logic, because it is an example of a type of reasoning that ordinary people habitually engage in. Clearly, it is not valid in first-order logic. Indeed, it is not deductively valid according to a much less formal criterion. Nobody who thought for more than a few seconds would say it was a deduction, for the simple and obvious reason that it is possible for Jon to enter the restaurant and see what he saw and yet not leave. He might know that the restaurant possessed an outstanding chef and regard the state of the waitress’s hands as a price worth paying for a superior cuisine. He might have an assignation with the waitress. He might not even like cleanliness. The possibilities consistent with his not leaving are endless. So Devlin’s objection amounts to saying that there are types of reasoning that are not deductive. But we know this; we have already mentioned inductive reasoning, which is the reasoning that we perform when we predict what will happen on the basis of evidence that seems to make such predictions highly likely but not certain.
And (*) is a typical example of such reasoning: it is probable reasoning, as the eighteenth-century British empiricist philosophers classified it. But to admit that there is probable, or inductive, reasoning is no reason to deny that there is also deductive, demonstrative reasoning. Which, of course, there is. Nor is there any need to construct a theory of general reasoning which blurs the distinction between those two types. As we have suggested above, there is already a general theory of inductive and deductive reasoning which nevertheless maintains the distinction. Indeed, expounding first-order logic in isolation from the theory of probability is really only telling half the story, for the model of deductive reasoning provided by first-order logic interlocks with probability theory to provide a general account of both inductive and deductive reasoning. Probabilities are a natural and indeed indispensable tool in the theory of non-deductive inference, where evidence supports to a greater or lesser extent some explanatory theory, but does not entail its truth (Howson and Urbach (1993) is an introductory text for a well-known probabilistic account, called the Bayesian theory of uncertain inference). As we have seen in discussion of these examples, probabilistic considerations can be relevant in discussing the adequacy or otherwise of a deductive model and the dovetailing of the theory of probability with that model lends support to both—unfortunately, to an extent we can merely catch glimpses of in this book.
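To illustrate how evidence can support a hypothesis to a greater or lesser extent without entailing it, here is a minimal Python sketch of Bayes’ theorem. It is not taken from the text: the function name posterior and all the numbers are hypothetical, chosen only to show the shape of the calculation.

```python
from fractions import Fraction

def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem: P(H | E) from P(H), P(E | H) and P(E | not-H)."""
    joint_true = likelihood_if_true * prior
    joint_false = likelihood_if_false * (1 - prior)
    return joint_true / (joint_true + joint_false)

# Hypothetical values: P(H) = 1/5, P(E | H) = 9/10, P(E | not-H) = 1/10.
prior = Fraction(1, 5)
updated = posterior(prior, Fraction(9, 10), Fraction(1, 10))
print(updated)  # 9/13
```

With these assumed values the evidence raises the probability of the hypothesis from 1/5 to 9/13, but it stays below 1: the evidence supports the hypothesis without entailing it.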

Notation Chapter 1

Chapter 2

Chapter 3

Chapter 4

Chapter 5

Chapter 6

Chapter 7

Chapter 9

Chapter 10

Chapter 11

Chapter 12

Answers to selected exercises

Chapter 1

Section 4

1 False.

2 False.

No.

Section 5

True.

Section 6

1

Section 7

2

Section 8

1 If you want the answer to be ‘yes’ if and only if B is true, the question is ‘Is ¬ (A↔B) true?’

B→A

2 ↔

3 (i) (ii)

4 ∧

5 A∧B; A=Untidy work will cost you marks, B=Inaccurate work will cost you marks.

6 No.

7 (A∧B)→(¬C→D)

A=You are over 18. B=You are married. C=You have already received benefit. D=Your name will go on the register.

Chapter 2

Section 2

1

Section 3

1

2 (i) The premise is redundant; (ii) the premises are unsatisfiable.

3

Section 4

(i) Tautology.

(ii) Tautology.

(iii) Contradiction.

(iv) Neither.

(v) Tautology.

(vi) Neither.

(vii) Neither.

(viii) Contradiction.

(ix) Contradiction.

Chapter 3

Section 1

1 (i) Two: A, B.

(ii) Infinitely many.

(iii) Infinitely many.

2 No. No, it does not mean that there is no biconditional sentence in L[A, B; ∧, ∨, ¬, →], since there are sentences in L truth-functionally equivalent to A↔B.

3 ¬A∨B⇔A→B; ¬(A∧A)∨B⇔A→B; ¬((A∧A)∧A)∨B⇔A→B; etc.

Section 3

(a)

(b)

(c)

Section 4

Section 6

1 (a) A∧B, B∧A

(b) A∧B, ¬B

(c) None

(d) A, B→C

(e) A→(B→C)

1

2k

2 (i) (A∧B)∨(¬A∧B)∨(¬A∧¬B)

(ii) (A∧B)∨(¬A∧B)∨(A∧¬B)

(iii) (A∧B)

(iv) (A∧B)∨(¬A∧¬B)

(v) (A∧B)∨(¬A∧B)∨(A∧¬B)∨(¬A∧¬B)

(vi) A∧B∧C

(vii) ¬A∧¬B

Section 7

1

2 The shortest are (A|A)|(B|B); (A↓A)↓(B↓B).

3 The shortest X is A|(B|B). The shortest Y is ((A↓A)↓B)↓((A↓A)↓B).

4 (i) Neither.

(ii) Tautology.

(iii) Contradiction.

(iv) Neither.

5 (i) A∨B

(ii) (¬A∧B∧C)∨(¬A∧¬B∧C)∨(¬A∧B∧¬C)

6 Because | and ↓ are binary connectives and (A|B)|C is not truth-functionally equivalent to A|(B|C), and (A↓B)↓C is not truth-functionally equivalent to A↓(B↓C).

Section 9*

1 ¬A∨B. 2 (¬A∨B)∧(A∨¬B)∧(A∨B). 3 ¬A∨B∨C. 4 A∨¬A.
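The Section 7 answers about | and ↓ can be checked by brute-force truth tables. The Python sketch below is illustrative only. The helper names nand, nor and equivalent are hypothetical, and Y is read as ((A↓A)↓B)↓((A↓A)↓B). The comparison of X and Y with A→B is an observation from the truth tables, not something the answer key states.

```python
from itertools import product

def nand(p, q):
    """Sheffer stroke |: false only when both arguments are true."""
    return not (p and q)

def nor(p, q):
    """Joint denial ↓: true only when both arguments are false."""
    return not (p or q)

def equivalent(f, g, n):
    """True if f and g agree on every assignment to n sentence letters."""
    return all(f(*row) == g(*row) for row in product([True, False], repeat=n))

# Answer 6: neither connective is associative.
nand_associative = equivalent(
    lambda a, b, c: nand(nand(a, b), c),
    lambda a, b, c: nand(a, nand(b, c)),
    3,
)
nor_associative = equivalent(
    lambda a, b, c: nor(nor(a, b), c),
    lambda a, b, c: nor(a, nor(b, c)),
    3,
)
print(nand_associative, nor_associative)  # False False

# Answer 3: X = A|(B|B) and Y = ((A↓A)↓B)↓((A↓A)↓B), compared with A→B.
conditional = lambda a, b: (not a) or b
x_matches = equivalent(lambda a, b: nand(a, nand(b, b)), conditional, 2)
y_matches = equivalent(
    lambda a, b: nor(nor(nor(a, a), b), nor(nor(a, a), b)),
    conditional,
    2,
)
print(x_matches, y_matches)  # True True
```

Because the two bracketings disagree on at least one row, an unbracketed string such as A|B|C would be ambiguous, which is the point of answer 6.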

Chapter 4

Section 2

1 (i) β

(ii) Neither.

(iii) β

(iv) Literal.

(v) α

(vi) α

2 (i) Closed.

(ii) Open.

(iii) Closed.

(iv) Open.

Section 3

Hence truth-functionally valid.

(b) Truth-functionally invalid. Two counterexamples:

A-F, B-T, C-T, D-T
A-F, B-F, C-T, D-T

(c)

4 (a)

A B C
T T T
T F T
F T T
T F F
F F T
F T F
F F F

Chapter 5

Section 2

1 (i) Everyone is tall.

(ii) Someone is broad.

(iii) Someone is tall and broad.

(iv) Someone is tall and everyone is broad.

No.

Section 3

1 (i)

True: it says that there is a pair of positive integers one of which is less than or equal to the other.

(ii)

True: it makes the same statement as (i).

(iii)

True: it says that every positive integer is less than or equal to itself.

(iv)

True: it says that for every positive integer there is one at least as large.

(v)

True: it says that for every positive integer there is one no greater than it.

(vi)

True: it says that there is a positive integer less than or equal to every positive integer.

2 Only (i) and (ii) remain true; the remainder are false.

Section 4

2 (i) (ii) (iii) (iv) (v) (vi) (vii) (viii) (ix) (x) (xi)

Note: There are sentences equivalent but not identical to each of (i)–(x), and if your answer to any of these is not as given above you should try to see whether it is equivalent to it.

3 (i) (ii) (iii)

S(a, b)

Chapter 6

Section 1

1

2 (a) P(x) (b) P(x)∧Q(x) (c) (d) (e)

3 (i) (a) P(a) (b) P(a)∧Q(a) (c) (d) (e)

6 Closed.

Section 2

1 (a) δ (b) δ (c) Neither. (d) Neither. (e) δ (f) Neither. (g) Neither.

2 (a) (b) ¬(Q(a)→R(a, a)) (c) ¬(Q(a)∧P(a))

3 (δ) is misapplied at lines 5 and 6: b cannot be used to instantiate line 5, since it already appears at line 4.

4 One such domain is N, the set of natural numbers, with R(x, y) interpreted as x
