FORWARD-CHAINING

make a rule to forward chain when a certain trigger arises


Parent topic:  RULE-CLASSES
Home

See rule-classes for a general discussion of rule classes and how they are used to build rules from formulas. An example :corollary formula from which a :forward-chaining rule might be built is:

Example:
(implies (and (p x) (r x))        when (p a) appears in a formula to be
         (q (f x)))               simplified, try to establish (r a) and
                                  if successful, add (q (f a)) to the
                                  known assumptions

To specify the triggering terms provide a non-empty list of terms as the value of the :trigger-terms field of the rule class object.

General Form:
Any theorem, provided an acceptable triggering term exists.

Forward chaining rules are generated from the corollary term as follows. If that term has the form (implies hyp concl), then the let expressions in concl (formally, lambda expressions) are expanded away, and the result is treated as a conjunction, with one forward chaining rule with hypothesis hyp created for each conjunct. In the other case, where the corollary term is not an implies, we process it as we process the conclusion in the first case.

The :trigger-terms field of a :forward-chaining rule class object should be a non-empty list of terms, if provided, and should have certain properties described below. If the :trigger-terms field is not provided, it defaults to the singleton list containing the ``atom'' of the first hypothesis of the formula. (The atom of (not x) is x; the atom of any other term is the term itself.) If there are no hypotheses and no :trigger-terms were provided, an error is caused.

A triggering term is acceptable if it is not a variable, a quoted constant, a lambda application, a let-expression, or a not-expression, and every variable symbol in the conclusion of the theorem either occurs in the hypotheses or occurs in the trigger.

:Forward-chaining rules are used by the simplifier before it begins to rewrite the literals of the goal. If any term in the goal is an instance of a trigger of some forward chaining rule, we try to establish the hypotheses of that forward chaining theorem (from the negation of the goal). To relieve a hypotheses we only use type reasoning, evaluation of ground terms, and presence among our known assumptions. We do not use rewriting. If all hypotheses are relieved, we add the instantiated conclusion to our known assumptions.

GENERALIZE

make a rule to restrict generalizations


Parent topic:  RULE-CLASSES
Home

See rule-classes for a general discussion of rule classes and how they are used to build rules from formulas. An example :corollary formula from which a :generalize rule might be built is:

Example: any theorem

General Form: any theorem

To use such a :generalize rule, the system waits until it has decided to generalize some term, term, by replacing it with some new variable v. If any :generalize formula can be instantiated so that some non-variable subterm becomes term, then that instance of the formula is added as a hypothesis.

At the moment, the best description of how ACL2 :generalize rules are used may be found in the discussion of ``Generalize Rules,'' pp 248 of A Computational Logic Handbook.

INDUCTION

make a rule that suggests a certain induction


Parent topic:  RULE-CLASSES
Home

Example:
(:induction :corollary t  ; the theorem proved is irrelevant!
            :pattern (* 1/2 i)
            :condition (and (integerp i) (>= i 0))
            :scheme (recursion-by-sub2 i))

In ACL2, as in Nqthm, the functions in a conjecture ``suggest'' the inductions considered by the system. Because every recursive function must be admitted with a justification in terms of a measure that decreases in a well-founded way on a given set of ``controlling'' arguments, every recursive function suggests a dual induction scheme that ``unwinds'' the function from a given application.

For example, since append (actually binary-append, but we'll ignore the distinction here) decomposes its first argument by successive cdrs as long as it is a non-nil true list, the induction scheme suggested by (append x y) has a base case supposing x to be either not a true list or to be nil and then has an induction step in which the induction hypothesis is obtained by replacing x by (cdr x). This substitution decreases the same measure used to justify the definition of append. Observe that an induction scheme is suggested by a recursive function application only if the controlling actuals are distinct variables, a condition that is sufficient to insure that the ``substitution'' used to create the induction hypothesis is indeed a substitution and that it drives down a certain measure. In particular, (append (foo x) y) does not suggest an induction unwinding append because the induction scheme suggested by (append x y) requires that we substitute (cdr x) for x and we cannot do that if x is not a variable symbol.

Once ACL2 has collected together all the suggested induction schemes it massages them in various ways, combining some to simultaneously unwind certain cliques of functions and vetoing others because they ``flaw'' others. We do not further discuss the induction heuristics here; the interested reader should see Chapter XIV of A Computational Logic (Boyer and Moore, Academic Press, 1979) which represents a fairly complete description of the induction heuristics of ACL2.

However, unlike Nqthm, ACL2 provides a means by which the user can elaborate the rules under which function applications suggest induction schemes. Such rules are called :induction rules. The definitional principle automatically creates an :induction rule, named (:induction fn), for each admitted recursive function, fn. It is this rule that links applications of fn to the induction scheme it suggests. Disabling (:induction fn) will prevent fn from suggesting the induction scheme derived from its recursive definition. It is possible for the user to create additional :induction rules by using the :induction rule class in defthm.

Technically we are ``overloading'' defthm by using it in the creation of :induction rules because no theorem need be proved to set up the heuristic link represented by an :induction rule. However, since defthm is generally used to create rules and rule-class objects are generally used to specify the exact form of each rule, we maintain that convention and introduce the notion of an :induction rule. An :induction rule can be created from any lemma whatsoever.

General Form of an :induction Lemma or Corollary: T

General Form of an :induction rule-class: (:induction :pattern pat-term :condition cond-term :scheme scheme-term)

where pat-term, cond-term, and scheme-term are all terms, pat-term is the application of a function symbol, fn, scheme-term is the application of a function symbol, rec-fn, that suggests an induction, and, finally, every free variable of cond-term and scheme-term is a free variable of pat-term. We actually check that rec-fn is either recursively defined -- so that it suggests the induction that is intrinsic to its recursion -- or else that another :induction rule has been proved linking a call of rec-fn as the :pattern to some scheme.

The induction rule created is used as follows. When an instance of the :pattern term occurs in a conjecture to be proved by induction and the corresponding instance of the :condition term is known to be non-nil (by type reasoning alone), the corresponding instance of the :scheme term is created and the rule ``suggests'' the induction, if any, suggested by that term. If rec-fn is recursive, then the suggestion is the one that unwinds that recursion.

Consider, for example, the example given above,

(:induction :pattern (* 1/2 i)
            :condition (and (integerp i) (>= i 0))
            :scheme (recursion-by-sub2 i)).

In this example, we imagine that recursion-by-sub2 is the function:

(defun recursion-by-sub2 (i)
  (if (and (integerp i)
           (< 1 i))
      (recursion-by-sub2 (- i 2))
      t))

Observe that this function recursively decomposes its integer argument by subtracting 2 from it repeatedly and stops when the argument is 1 or less. The value of the function is irrelevant; it is its induction scheme that concerns us. The induction scheme suggested by (recursion-by-sub2 i) is

(and (implies (not (and (integerp i) (< 1 i)))   ; base case
              (:p i))
     (implies (and (and (integerp i) (< 1 i))    ; induction step
                   (:p (- i 2)))
              (:p i)))

We can think of the base case as covering two situations. The first is when i is not an integer. The second is when the integer i is 0 or 1. In the base case we must prove (:p i) without further help. The induction step deals with those integer i greater than 1, and inductively assumes the conjecture for i-2 while proving it for i. Let us call this scheme ``induction on i by twos.''

Suppose the above :induction rule has been added. Then an occurrence of, say, (* 1/2 k) in a conjecture to be proved by induction would suggest, via this rule, an induction on k by twos, provided k was known to be a nonnegative integer. This is because the induction rule's :pattern is matched in the conjecture, its :condition is satisfied, and the :scheme suggested by the rule is that derived from (recursion-by-sub2 k), which is induction on k by twos. Similarly, the term (* 1/2 (length l)) would suggest no induction via this rule, even though the rule ``fires'' because it creates the :scheme (recursion-by-sub2 (length l)) which suggests no inductions unwinding recursion-by-sub2 (since the controlling argument of recursion-by-sub2 in this :scheme is not a variable symbol).

Continuing this example one step further illustrates the utility of :induction rules. We could define the function recursion-by-cddr that suggests the induction scheme decomposing its consp argument two cdrs at a time. We could then add the :induction rule linking (* 1/2 (length x)) to (recursion-by-cddr x) and arrange for (* 1/2 (length l)) to suggest induction on l by cddr.

Observe that :induction rules require no proofs to be done. Such a rule is merely a heuristic link between the :pattern term, which may occur in conjectures to be proved by induction, and the :scheme term, from which an induction scheme may be derived. Hence, when an :induction rule-class is specified in a defthm event, the theorem proved is irrelevant. The easiest theorem to prove is, of course, t. Thus, we suggest that when an :induction rule is to be created, the following form be used:

(defthm name T
  :rule-classes ((:induction :pattern pat-term
                             :condition cond-term
                             :scheme scheme-term)))

The name of the rule created is (:induction name). When that rune is disabled the heuristic link between pat-term and scheme-term is broken.

LINEAR

make some arithmetic inequality rules


Parent topic:  RULE-CLASSES
Home

See rule-classes for a general discussion of rule classes and how they are used to build rules from formulas. An example :corollary formula from which a :linear rule might be built is:

Example:
(implies (and (eqlablep e)           if inequality reasoning begins to
              (true-listp x))        consider how (length (member a b))
         (>= (length x)              compares to any other term, add to
             (length (member e x)))) set of known inequalities the fact
                                     that it is no larger than (length b),
                                     provided (eqlablep a) and (true-listp b)
                                     rewrite to t

General Form:
(and ...
     (implies (and ...hi...)
              (implies (and ...hk...)
                       (and ...
                            (rel lhs rhs)
                            ...)))
     ...)

Note: One :linear rule class object might create many linear arithmetic rules from the :corollary formula. To create the rules, we first flatten the and and implies structure of the formula, transforming it into a conjunction of formulas, each of the form

(implies (and h1 ... hn) (rel lhs rhs))

where no hypothesis is a conjunction and rel is one of the inequality relations <, <=, =, /=, >, or >=. If necessary, the hypothesis of such a conjunct may be vacuuous.

We create a :linear rule for each such conjunct, if possible, and otherwise cause an error. Each rule has one or more ``trigger terms'' which may be specified by the user using the :trigger-terms field of the rule class or which may be defaulted to values chosen by the system. We discuss the determination of trigger terms after discussing how linear rules are used.

:linear rules are used by a linear arithmetic decision procedure during rewriting. We describe the procedure very roughly here. Fundamental to the procedure is the notion of a linear polynomial inequality. A ``linear polynomial'' is a sum of terms, each of which is the product of a rational constant and an ``unknown.'' The ``unknown'' is permitted to be 1 simply to allow a term in the sum to be constant. Thus, an example linear polynomial is 3*x + 7*a + 2; here x and a are the (interesting) unknowns. In our case, the unknowns need not be variable symbols. For example, (length x) might be used as an unknown in a linear polynomial. Thus, another linear polynomial is 3*(length x) + 7*a. A ``linear polynomial inequality'' is an inequality (either < or <=) relation between two linear polynomials. Certain linear polynomial inequalities can be combined by cross-multiplication and addition to permit the deduction of a third linear inequality polynomial with fewer unknowns. If this deduced inequality is manifestly false, a contradiction has been deduced from the assumed inequalities.

For example, suppose we have three assumptions:

p1:  3*x + 7*a <  4 
p2:          3 <  2*x
p3:          0 <= a.

By cross-multiplying and adding the first two (that is, multiplying p1 by 2, p2 by 3 and adding the respective sides), we deduce the intermediate result

p4:  6*x + 14*a + 9 < 8 + 6*x

which, after cancellation, is

p4:        14*a + 1 <  0.

If we then cross-multiply and add p3 to p4, we get

p5:               1 <= 0,

a contradiction. Thus, we have proved that p1 and p2 imply the negation of p3.

All of the unknowns of a polynomial must be eliminated by cancellation in order to produce a constant inequality. We can choose to eliminate the unknowns in any order. We eliminate them in term-order, largest unknowns first. (See term-order.) Now suppose that this procedure does not produce a contradiction but instead yields a set of nontrivial inequalities. A contradiction might still be deduced if we could add to the set some additional inequalities allowing additional cancellation. That is where :linear lemmas come in. When the set of inequalities has stabilized under cross-multiplication and addition and no contradiction is produced, we search the data base of :linear rules for rules about the unknowns that are candidates for cancellation (i.e., are the largest unknowns in their respective inequalities).

If the trigger term of some :linear rule can be instantiated to yield a candidate for cancellation, we so instantiate that rule, attempt to relieve the hypotheses with general-purpose rewriting, and, if successful, convert the concluding inequality into a polynomial inequality and add it to the set. This may let cancellation continue. See the discussion of ``Linear Arithmetic Rules'' pp 242 of A Computational Logic Handbook for more details.

We now describe how the trigger terms are determined. Most of the time, the trigger terms are not specified by the user and are instead selected by the system. However, the user may specify the terms by including an explicit :trigger-terms field in the rule class, e.g.,

General Form of a Linear Rule Class:
(:LINEAR :COROLLARY formula
         :TRIGGER-TERMS (term1 ... termk))

Each termi must be a term and must not be a variable, quoted constant, lambda application, let-expression or if-expression. In addition, each termi must be such that if all the variables in the term are instantiated and then the hypotheses of the corollary formula are relieved (possibly instantiating additional free variables), then all the variables in the concluding inequality are instantiated. We generate a linear rule for each conjuctive branch through the corollary and store each rule under each of the specified triggers. Thus, if the corollary formula contains several conjuncts, the variable restrictions on the termi must hold for each conjunct.

Note: Problems may arise if you explicitly store a linear lemma under a trigger term that, when instantiated, is not the largest unknown in the instantiated concluding inequality. Suppose for example you store the linear rule (<= (fn i j) (/ i (* j j))) under the trigger term (fn i j). Then when the system ``needs'' an inequality about (fn a b), i.e., because (fn a b) is the largest unknown in some inequality in the set of assumed inequalities, it will appeal to the rule and deduce (<= (fn a b) (/ a (* b b))). However, the largest unknown in this inequality is (/ a (* b b)) and hence this inequality will not be used until that term is eliminated by cancellation with some other inequality. It is generally best to specify as a trigger term one of the ``maximal'' terms of the polynomial, as described below.

If :trigger-terms is omitted the system computes a set of trigger terms. Each conjunct of the corollary formula may be given a unique set of triggers depending on the variables that occur in the conjunct and the addends that occur in the concluding inequality. In particular, the trigger terms for a conjunct is the list of all ``maximal addends'' in the concluding inequality.

The ``addends'' of (+ x y) and (- x y) are the union of the addends of x and y. The addends of (- x) and (* n x), where n is a numeric constant, is in all cases just {x}. The addends of an inequality are the union of the addends of the left- and right-hand sides. The addends of any other term, x, is {x}.

A term is maximal for a conjunct (implies hyps concl) of the corollary if (a) the term is a non-variable, non-quote, non-lambda application, non-let and non-if expression, (b) the term contains enough variables so that when they are instantiated and the hypotheses are relieved then all the variables in concl are instantiated, and (c) any instantiation of the term is always at least as ``large'' (see term-order for a description of the order used) as the corresponding instantiations of any other addend in concl.

LINEAR-ALIAS

make a rule to extend the applicability of linear arithmetic


Parent topic:  RULE-CLASSES
Home

See rule-classes for a general discussion of rule classes and how they are used to build rules from formulas. An example :corollary formula from which a :linear-alias rule might be built is as follows. (Of course, the functions fix, plus, diff, and lessp would have to be defined first.)

Example:
(and (equal (fix (plus x y)) (plus x y))
     (equal (fix (diff x y)) (diff x y))
     (equal (fix (fix x)) (fix x))
     (equal (lessp x y) (< (fix x) (fix y)))
     (equal (plus x y) (+ (fix x) (fix y)))
     (implies (not (lessp x y))
              (equal (diff x y) (- (fix x) (fix y)))))

:linear-alias rules are essentially a hack to allow the creation of the "NQTHM" package. We will be surprised if the casual user of ACL2 ever finds a use for them. They are not documented very thoroughly.

General Forms:
(equal lhs rhs)
(implies hyps (equal lhs rhs))

where every variable that occurs in the rule must occur in the lhs. As for :rewrite rules, we strip the conjuncts out of the :corollary formula and generate one :linear-alias rule for each. Thus, it is possible to package together a complete set of :linear-alias rules under a single name.

When linear arithmetic is applied to a term that is not ``linearizable,'' i.e., a term that does not start with <, +, or -, it attempts to transduce the term to a linearizable one by applying all known :linear-alias rules as rewrite rules. This transduction is performed by the simple rewrite function rewrite-with-linear-aliases. Like the standard ACL2 rewriter, the linear alias rewriter applies linear alias rules exhaustively in the sense that if a rule applies then the rhs of the rule is recursively rewritten. Unlike :rewrite rules, :linear-alias rules are applied from outside in (!) and hypotheses are not relieved but merely collected.

For example, rewriting (lessp (diff (plus a b) c) (diff u v)) with the rules generated from the conjunction above produces

(< (+ (+ (fix a) (fix b))
      (- (fix c)))
   (+ (fix u) (- (fix v))))

under the hypothesis that (not (lessp (plus a b) c)) and (not (lessp u v)). (Actually, the + and - operators shown above are really calls of binary-+ and unary--.)

Think of :linear-alias rules as transforming a term from one theory to another, generating hypotheses in the first theory. Call the initial theory ``nqthm'' and the final one ``acl2.'' One should strive to create a set of rules that transform from the nqthm to the acl2 theory while only generating hypotheses in the first theory. The restriction is motivated by the desire not to surprise the nqthm user by suddently introducing conditions that are not phrased in the nqthm theory. The purpose of the nqthm package is to emulate nqthm, so it simply represents a failure if acl2 suddenly produces a case split, say, in which some non-nqthm term appears, such as (rationalp (plus a b)). To this end, we offer some advice below. However, it is especially helpful to study how the outside-in rewrite strategy uses the rules above to achieve this goal.

For example,

  (lessp (plus a b) (diff u v))

first steps to

  (< (fix (plus a b)) (fix (diff u v))),

unconditionally. But plus and diff don't need fixing:

  (< (plus a b) (diff u v)).

Rewriting further finishes with

  (< (+ (fix a) (fix b)) (+ (fix u) (- (fix v)))).

Observe that had we rewritten inside-out the sequence would have been:

  (lessp (+ (fix a) (fix b)) (+ (fix u) (- (fix v))))

and then

  (< (fix (+ (fix a) (fix b))) (fix  (+ (fix u) (- (fix v))))).

Note that we now need to know that + doesn't need fixing if its arguments have been. But that is a more complicated rule than the simple one that plus doesn't need fixing. Perhaps a different set of rules would work well for inside-out rewriting, but I just used the ones that first leapt to mind. It is interesting to note that linearization proceeds outside in too.

Here then is the advice. Rules should either transform nqthm terms to nqthm terms, e.g., (fix (plus x y)) to (plus x y), or should transform nqthm terms to acl2 terms, e.g., (lessp x y) to (< (fix x) (fix y)). When the rhs introduces an acl2 function symbol, the symbol should be outside of all nqthm terms in the rhs. Hypotheses should be expressed in the nqthm theory. No rule should rewrite an acl2 function symbol. If the transformation involves ``fixers'' then each operator should have two rules about it. The first, like (plus x y) = (+ (fix x) (fix y)), explains the nqthm function in terms of an acl2 function with fixed arguments. The second, like (fix (plus x y)) = (plus x y) eliminates the interior fixes generated by the first rule. These conventions mean that the rewriter, by going from outside in, always sees nqthm terms and instantiates the hypotheses of rules with nqthm terms. To violate these conventions might produce a set of rules that would cause a non-nqthm hypothesis to be generated.

Finally, observe that the rules probably will introduce a ``fixer'', which is the function fix in the NQTHM package. The fixer tends to settle around the nonlinear leaves of the resultant linearizable term. It is important to have a :type-prescription rule establishing that the fixer produces a rational.

When a :linear arithmetic rule is proved its conclusion is linearized to obtain the relevant polynomials. :linear-alias rules may kick in there; thus, a conclusion of the form (not (lessp (length str) (strpos pat str))) may be a legal :linear rule. Note however that its linearization is (not (< (fix (length str)) (fix (strpos pat str)))). The maximal term of this rule is (fix (strpos pat str)). However, we do not want to store such a linear rule under fix because that would result in all of the Nqthm proveall's linear rules being stored under the same symbol. We therefore note when the maximal term does not actually occur in the conclusion, is of the form (fix term), for some unary function fix, and term does occur in the conclusion. In that case, we call fix a ``fixer'' and we store a t under the property 'linear-alias-fixerp. We then store the rule with the maximal term term instead of (fix term). Later, when we are adding lemmas to the linear pot, we note that if we seek linear lemmas about (fix term), where fix is a fixer, we look for lemmas both stored under the function symbol of term and stored under the function symbol fix. In all cases, the maximal term of the rule contains as its top-most function symbol the symbol under which the rule is stored. Thus, if we seek lemmas about (fix (strpos pat str)) we will go to the property list of strpos and there find rules about (strpos pat str). We will instantiate their conclusions and linearize them, producing polynomials about (fix (strpos pat str)). This raises a subtle point. It could be that there is more than one fixer used in connection with strpos. Thus, some lemmas could linearize to polynomials about (rat (strpos pat str)). Therefore, when we create the linear rule with (strpos pat str) as its (fixed) maximal term, we store in the :fixer field of the rule the symbol fix to indicate that the linearization of this conclusion will (probably) produce (fix (strpos pat str)) terms. When looking for lemmas about (fix (strpos pat str)) we actually look for lemmas about strpos that have :fixer fix.

Here is an extended example of the use of linear aliasing. First, we define some nqthm-level functions:

(defun fix (x) (if (and (integerp x) (>= x 0)) x 0))
(defun lessp (x y) (< (fix x) (fix y)))
(defun plus (x y) (+ (fix x) (fix y)))
(defun diff (x y) (fix (- (fix x) (fix y))))

Note that (fix x) gets a :type-prescription rule that it is a nonnegative integer.

Now we prove a :linear-alias rule, actually a package of the six rules above.

(defthm nqthm-linear-arithmetic
  (and (equal (fix (plus x y)) (plus x y))
       (equal (fix (diff x y)) (diff x y))
       (equal (fix (fix x)) (fix x))
       (equal (lessp x y) (< (fix x) (fix y)))
       (equal (plus x y) (+ (fix x) (fix y)))
       (implies (not (lessp x y))
                (equal (diff x y) (- (fix x) (fix y)))))
  :rule-classes :linear-alias)

To mimic Nqthm behavior, we need a :rewrite rule to dispatch the cases generated by the case split caused by linearizing diff-expressions. This is not, strictly speaking, part of linear aliasing.

(defthm diff-0 (implies (lessp x y) (equal (diff x y) 0)))

Now we disable the nqthm-level functions. Henceforth, we do not expect to see ACL2-level functions in proofs about these concepts.

(in-theory (disable fix lessp plus diff))

Note that we can now prove such facts as

(thm (implies (and (lessp a b) (lessp b c)) (lessp a c)))
(thm (implies (lessp a (diff b c)) (lessp a b)))

and that the proofs appeal to linear arithmetic!

We can add new aliases (although they get new names). Note that we have to (a) define the function, (b) prove the :linear-alias rule -- probably enabling nqthm-level functions to do it, and (c) disable the new function so it survives rewriting and is around for linearization to de-alias.

(defun add1 (x) (+ (fix x) 1))                       ; (a)
(defthm add1-linear                                  ; (b) - see below
  (and (equal (add1 x) (+ (fix x) 1))
       (equal (fix (add1 x)) (add1 x)))
  :rule-classes :linear-alias
  :hints (("Goal" :in-theory (enable fix))))
(in-theory (disable add1))                           ; (c)
(thm (lessp x (add1 x)))

Observe that in (b) we prove two rules about add1: the one that relates it to + and the one that removes the fixes inserted by the first rule.

Next we illustrate the use of :linear lemmas in the context of linear aliasing. Consider the function:

(defun strpos (pat str)
; Find the position of pat in str.
  (cond ((atom str) 0)
        ((equal pat (car str)) 0)
        (t (add1 (strpos pat (cdr str))))))

(defun strlen (str)
  (cond ((atom str) 0)
        (t (add1 (strlen (cdr str))))))

We can prove and store the following :linear lemma about strpos.

(defthm strpos-length
  (not (lessp (strlen str) (strpos pat str)))
  :rule-classes :linear)

This event is interesting for two reasons. The proof illustrates linear aliasing. But the fact that this event is stored as a :linear lemma -- even though the conclusion is not about < -- also illustrates linear aliasing. In fact, this lemma is stored under (strpos pat str) with :fixer fix.

We can illustrate the storage and retrieval of the lemma directly by proving:

(thm (not (< (fix (strlen str)) (fix (strpos pat str)))))

But more commonly we will be interested in proofs that reach out for facts about strpos (implicitly fixing them for us) as in:

(thm (implies (lessp (strlen str) mx) 
              (lessp (strpos pat str) mx))).

This ends the extended example of linear aliasing lemmas.

META

make a :meta rule (a hand-written simplifier)


Parent topic:  RULE-CLASSES
Home

See rule-classes for a general discussion of rule classes and how they are used to build rules from formulas. Meta rules extend the ACL2 simplifier with hand-written code to transform certain terms to equivalent ones. To add a meta rule, the :corollary formula must establish that the hand-written ``metafunction'' preserves the meaning of the transformed term.

Example :corollary formulas from which :meta rules might be built are:

Examples:
(equal (ev x a)                 ; Modify the rewriter to use fn to
       (ev (fn x) a))           ; transform terms.  The :trigger-fns
                                ; of the :meta rule-class specify
                                ; the top-most function symbols of
                                ; those x that are candidates for
                                ; this transformation.

(implies (and (pseudo-termp x)  ; As above, but this illustrates
              (alistp a))       ; that without loss of generality we
         (equal (ev x a)        ; may restrict x to be shaped like a
                (ev (fn x) a))) ; term and a to be an alist.


(implies (and (pseudo-termp x)  ; As above (with or without the
              (alistp a)        ; hypotheses on x and a) with the
              (ev (hyp-fn x) a)); additional restriction that the
         (equal (ev x a)        ; meaning of (hyp-fn x) is true in
                (ev (fn x) a))) ; the current context.  That is, the
                                ; applicability of the transforma-
                                ; tion may be dependent upon some
                                ; computed hypotheses.

A non-nil list of function symbols must be supplied as the value of the :trigger-fns field in a :meta rule class object.

General Forms:
(implies (and (pseudo-termp x)    ; this hyp is optional
              (alistp a)          ; this hyp is optional
              (ev (hyp-fn x) a))  ; this hyp is optional
         (equiv (ev x a)
                (ev (fn x) a)))

where equiv is a known equivalence relation, x and a are distinct variable names, and ev is an evaluator function (see below), and fn is a function symbol, as is hyp-fn when provided. Both fn and hyp-fn should take a single argument. If fn and/or hyp-fn are guarded, their guards should be (implied by) pseudo-termp. See pseudo-termp. We say the theorem above is a ``metatheorem'' or ``metalemma'' and fn is a ``metafunction'', and hyp-fn is a ``hypothesis metafunction''.

We defer discussion of the case in which there is a hypothesis metafunction and for now address the case in which the other two hypotheses are present.

In the discussion below, we refer to the argument, x, of fn and hyp-fn as a ``term.'' When these metafunctions are executed by the simplifier, they will be applied to (the quotations of) terms. But during the proof of the metatheorem itself, x may not be the quotation of a term. If the pseudo-termp hypothesis is omitted, x may be any object. Even with the pseudo-termp hypothesis, x may merely ``look like a term'' but use non-function symbols or function symbols of incorrect arity. In any case, the metatheorem is stronger than necessary to allow us to apply the metafunctions to terms, as we do in the discussion below. We return later to the question of proving the metatheorem.

Suppose the general form of the metatheorem above is proved with the pseudo-termp and alistp hypotheses. Then when the simplifier encounters a term, (h t1 ... tn), that begins with a function symbol, h, listed in :trigger-fns, it applies the metafunction, fn, to the quotation of the term, i.e., it evaluates (fn '(h t1 ... tn)) to obtain some result, which can be written as 'val. If 'val is different from '(h t1 ... tn) and val is a term, then (h t1 ... tn) is replaced by val, which is then passed along for further rewriting. Because the metatheorem establishes the correctness of fn for all terms (even non-terms!), there is no restriction on which function symbols are listed in the :trigger-fns. Generally, of course, they should be the symbols that head up the terms simplified by the metafunction fn. See term-table for how one obtains some assistance towards guaranteeing that val is indeed a term.

The ``evaluator'' function, ev, is a function that can evaluate a certain class of expressions, namely, all of those composed of variables, constants, and applications of a fixed, finite set of function symbols, g1, ..., gk. Generally speaking, the set of function symbols handled by ev is chosen to be exactly the function symbols recognized and manipulated by the metafunctions being introduced. For example, if fn manipulates expressions in which 'equal and 'binary-append occur as function symbols, then ev is generally specified to handle equal and binary-append. The actual requirements on ev become clear when the metatheorem is proved. The standard way to introduce an evaluator is to use the ACL2 macro defevaluator, though this is not strictly necessary. See defevaluator for details.

Why are we justified in using metafunctions this way? Suppose (fn 'term1) is 'term2. What justifies replacing term1 by term2? The first step is to assert that term1 is (ev 'term1 a), where a is an alist that maps 'var to var, for each variable var in term1. This step is incorrect, because 'term1 may contain function symbols other than the ones, g1, ..., gk, that ev knows how to handle. But we can grow ev to a ``larger'' evaluator, ev*, an evaluator for all of the symbols that occur in term1 or term2. We can prove that ev* satisfies the constraints on ev. Hence, the metatheorem holds for ev* in place of ev, by functional instantiation. We can then carry out the proof of the equivalence of term1 and term2 as follows: Fix a to be an alist that maps the quotations of the variables of term1 and term2 to themselves. Then,

term1 = (ev* 'term1 a)      ; (1) by construction of ev* and a
      = (ev* (fn 'term1) a) ; (2) by the metatheorem for ev*
      = (ev* 'term2 a)      ; (3) by evaluation of fn
      = term2               ; (4) by construction of ev* and a

Note that in line (2) above, where we appeal to the (functional instantiation of the) metatheorem, we can relieve its (optional) pseudo-termp and alistp hypotheses by appealing to the facts that term1 is a term and a is an alist by construction.

Finally, we turn to the second case, in which there is a hypothesis metafunction. In that case, consider as before what happens when the simplifier encounters a term, (h t1 ... tn), where h is listed in :trigger-fns. This time, after it applies fn to '(h t1 ... tn) to obtain the quotation of some new term, 'val, it then applies the hypothesis metafunction, hyp-fn. That is, it evaluates (hyp-fn '(h t1 ... tn)) to obtain some result, which can be written as 'hyp-val. If hyp-val is not in fact a term, the metafunction is not used. Provided hyp-val is a term, the simplifier attempts to establish (by conventional backchaining) that this term is non-nil in the current context. If this attempt fails, then the meta rule is not applied. Otherwise, (h t1...tn) is replaced by val as in the previous case (where there was no hypothesis metafunction).

Why is it justified to make this extension to the case of hypothesis metafunctions? First, note that the rule

(implies (and (pseudo-termp x)
              (alistp a)
              (ev (hyp-fn x) a))
         (equal (ev x a)
                (ev (fn x) a)))

is logically equivalent to the rule

(implies (and (pseudo-termp x)
              (alistp a))
         (equal (ev x a)
                (ev (new-fn x) a)))

where (new-fn x) is defined to be (list 'if (hyp-fn x) (fn x) x). (If we're careful, we realize that this argument depends on making an extension of ev to an evaluator ev* that handles if and the functions manipulated by hyp-fn.) If we write 'term for the quotation of the present term, and if (hyp-fn 'term) and (fn 'term) are both terms, say hyp1 and term1, then by the previous argument we know it is sound to rewrite term to (if hyp1 term1 term). But since we have established in the current context that hyp1 is non-nil, we may simplify (if hyp1 term1 term) to term1, as desired.

We now discuss the role of the pseudo-termp hypothesis. (Pseudo-termp x) checks that x has the shape of a term. Roughly speaking, it insures that x is a symbol, a quoted constant, or a true list consisting of a lambda expression or symbol followed by some pseudo-terms. Among the properties of terms not checked by pseudo-termp are that variable symbols never begin with ampersand, lambda expressions are closed, and function symbols are applied to the correct number of arguments. See pseudo-termp.

There are two possible roles for pseudo-termp in the development of a metatheorem: it may be used as the guard of the metafunction and/or hypothesis metafunction and it may be used as a hypothesis of the metatheorem. Generally speaking, the pseudo-termp hypothesis is included in a metatheorem only if it makes it easier to prove. The choice is yours. (An extreme example of this is when the metatheorem is invalid without the hypothesis!) We therefore address ourselves the question: should a metafunction have a pseudo-termp guard? A pseudo-termp guard for a metafunction, in connection with other considerations described below, improves the efficiency with which the metafunction is used by the simplifier.

To make a metafunction maximally efficient you should (a) provide it with a pseudo-termp guard and exploit the guard when possible in coding the body of the function (see guards-and-evaluation, especially the section on efficiency issues), (b) verify the guards of the metafunction (see verify-guards), and (c) compile the metafunction (see comp). When these three steps have been taken the simplifier can evaluate (fn 'term1) by running the compiled ``primary code'' (see guards-and-evaluation) for fn directly in Common Lisp.

Before discussing efficiency issues further, let us review the for a moment the general case in which we wish to evaluate (fn `obj) for some :logic function. We must first ask whether the guards of fn have been verified. If not, we must evaluate fn by executing its logic definition. This effectively checks the guards of every subroutine and so can be slow. If, on the other hand, the guards of fn have been verified, then we can run the primary code for fn, provided 'obj satisfies the guard of fn. So we must next evaluate the guard of fn on 'obj. If the guard is met, then we run the primary code for fn, otherwise we run the logic code.

Now in the case of a metafunction for which the three steps above have been followed, we know the guard is (implied by) pseudo-termp and that it has been verified. Furthermore, we know without checking that the guard is met (because term1 is a term and hence 'term1 is a pseudo-termp). Hence, we can use the compiled primary code directly.

We strongly recommend that you compile your metafunctions, as well as all their subroutines. Guard verification is also recommended.