Library ll

Load grammar.

LL(1) parsing

Assume some arbitrary LL(1) grammar G.

Variable G : grammar.

Axiom G_LL1 : LL_1 G.

Your goal is to define and prove a parser for LL(1) grammars, assuming that Nullable, First, etc. are computable (which is always the case, but we won't prove it). More precisely, assume some functions that give us productions satisfying some properties based on Nullable, First, etc.

Definition option A (P:A->Prop) := { e | P e } + { forall e, ~ P e }.
A value of type option A P is:
  • either a witness e:A together with a proof of P e,
  • or a proof that no such e exists.
Technically we make use of sig and sumor which is similar but different from sumbool.

Variable find_first : forall a U,
  option _ (fun e => In (U,e) G /\ First G a e).

Variable find_follow : forall a U,
  option _ (fun e => In (U,e) G /\ Follow G a U /\ Nullable G e).

Variable find_end : forall U,
  option _ (fun e => In (U,e) G /\ End G U /\ Nullable G e).

Parsing function

Define a function parse : list tsym -> list sym -> nat -> bool such that, for all words w in the language of s, parse w s n = true for large enough values of n. Of course parse w s n = true should only hold for words w in the language of s.
In Coq's jargon, the extra parameter n is the "fuel" of the parse function. It should decrease at every recursive call, making a Fixpoint definition valid, which would be very difficult to obtain otherwise.
Parsing should proceed based on the first element of each list:
  • parse (a::w) (a::s) f should consume the terminals a, and continue with parse w s (f-1);
  • parse (a::w) (U::s) f should use some reduction U ~> e such that First a e, or such that Nullable e and Follow a U, and continue with parse (a::w) (exp++s) (f-1);
  • parse nil (U::_) f should use some reduction U ~> e when Nullable e and End U;
  • parse nil nil f should return true;
  • other cases should return false.

Require Import Arith.

Fixpoint parse (w : list tsym) (s : list sym) fuel : bool := match fuel with
  | O => false
  | S fuel =>
    match w,s with
    | _, _ => false
    end
end.

Correctness

This part should not be too hard, as the parsing function performs steps allowed by deriv.

Lemma correct : forall fuel w s,
  parse w s fuel = true -> deriv G s (map T w).
Proof.
Admitted.

Completeness

This is significantly more difficult: given a derivation of w, we need to show that parse necessarily follows it, notably selecting the same productions. This relies on the LL(1) property of G. You will need several intermediate lemmas; a few (probably) useful ones are given below for guidance.

Lemma Reachable_cons : forall hd tl, Reachable G (hd :: tl) -> Reachable G tl.
Proof.
  intros hd tl (w,H).
  exists (w ++ hd :: nil).
  rewrite<- app_assoc.
  exact H.
Qed.

Lemma deriv_refl_concat : forall l w e, deriv G l w -> deriv G (e++l) (e++w).
Admitted.

We essentially want to show that, if U ~> exp, deriv (w ++ NT U :: l) s implies deriv (w ++ exp ++ l). The lemma is formulated in an indirect way so that we do not loose information about o when using induction.
Lemma deriv_deep : forall U exp i o, In (U,exp) G ->
  deriv G i o -> forall w l, o = (w ++ NT U :: l) -> deriv G i (w ++ exp ++ l).
Proof.
Admitted.

Lemma Reachable_NT :
  forall U exp l, In (U,exp) G -> Reachable G (NT U :: l) -> Reachable G (exp ++ l).
Proof.
  intros U exp l Hexp (w,HR).
  exists w.
  apply deriv_deep with (U:=U) (exp:=exp) (o := w ++ NT U :: l); auto.
Qed.

Lemma Nullable_app : forall u v, Nullable G (u ++ v) -> Nullable G u /\ Nullable G v.
Proof.
Admitted.

Lemma deriv_app : forall u v t w,
  deriv G (u ++ v) (T t :: w) ->
  (Nullable G u /\ First G t v) \/ First G t u.
Proof.
Admitted.

Theorem complete :
  forall w, deriv G (NT 0 :: nil) (map T w) -> exists f, parse w (NT 0 :: nil) f = true.
Proof.
Admitted.