The coordinator of the «Theory of Computation» subject in our university
wants to create a website that automatically evaluates exercises on
contextfree grammars (CFGs). The idea is that, for each exercise, the teacher
simply prepares a reference solution CFG
$G_1$, and then, any CFG
$G_2$
submitted by a student is automatically compared against
$G_1$ to determine
whether they are equivalent (i.e., whether
$G_1$ and
$G_2$ generate the same
language). Unfortunately, it is not possible to create such an evaluator, as
equivalence of CFGs is an undecidable problem.
Nevertheless, it is possible to implement a less ambitious automatic evaluator
as follows: for each length
$\ell$ (up to a certain maximum), test whether
there exists a word
$w$ with length
$\ell$ that is generated by one of
$G_1,G_2$ but not by the other. Clearly, when such word
$w$ exists, then
$w$ is
a
counterexample to the equivalence of
$G_1$ and
$G_2$. When no such
$w$
exists, either
$G_1$ and
$G_2$ are indeed equivalent, or we did not test with
an
$\ell$ big enough to find a counterexample. There are many possible ways to
implement an algorithm that, given
$G_1,G_2$ and
$\ell$, looks for such
counterexample
$w$; here we consider one possible technique.
Solve the following exercise by means of a reduction to
SAT:
 Given a natural $\ell>0$ and two CFGs $G_1,G_2$ over an alphabet
$\Sigma$, determine whether there exists a word $w\in\Sigma^*$ with length
$\ell$ that is generated by one of the grammars but not by the other.
We call such $w$ the counterexample to the equivalence of $G_1$ and
$G_2$.
To simplify the setting, we assume that the grammars are normalized in the
following sense: all nonterminal symbols are useful (i.e., they generate at
least one word, and they can be reached from the start symbol of the grammar),
and all the production rules are either of the form $X\to YZ$ or $X\to a$, where
$X,Y,Z$ are nonterminal symbols, and $a$ is a terminal symbol in $\Sigma$.
Note that a CFG normalized like that cannot generate the empty language or the
empty word, and that any production rule of the form $X\to YZ$ generates words
with size at least $2$.
The input of the exercise and the output with the solution (when the input is
solvable) are as follows:

grammars: array [2] of array of array of array of int
length: int
numterminals: int
The input contains the two normalized grammars that must be compared (one in
grammars[0] and the other one in grammars[1]), the specific
length${}>0$ that the counterexample must have, and the size
numterminals${}>0$ of the alphabet $\Sigma$ of terminal symbols.
All symbols are represented by numbers: nonterminal symbols are represented by
nonnegative numbers $0,1,\ldots$ (where $0$ is always used for the start
symbol of a grammar), and terminal symbols are represented by negative numbers
$1,2,\ldots,{}$numterminals. For each g${}\in\{0,1\}$,
the grammar grammars[g] is an array of arrays of arrays of integers
with the following meaning: for each nonterminal nt of the grammar,
grammars[g][nt] contains the list of all the righthand sides of the
production rules of the form nt${}\to YZ$ or nt${}\to a$, and thus,
for the i’th of such righthand sides (counting from $0$),
grammars[g][nt][i] is an array with either $2$ nonterminal symbols (for
a case like nt${}\to YZ$) or $1$ terminal symbol (for a case like
nt${}\to a$).
For example, given a normalized grammar $S\to ABa,\;A\to a,\,B\to b$, with $S$
being the start symbol, we could use the encoding $S=0$, $A=1$, $B=2$, $a=1$,
$b=2$ and represent the grammar as follows:
Original Representation
 
S > AB  a [[[1 2] [1]]
A > a [[1]]
B > b [[2]]]
Note that the lefthand sides of the production rules are implicitly
represented by the index into the outermost array (in this case, indexes $0$,
$1$, $2$, that represent $S$, $A$, $B$, respectively), that each of the items
in this outermost array is a list with the righthand sides that correspond
to that implicit lefthand side, and that each of such righthand sides is
either a list of $2$ nonterminal symbols (like [1 2] in the example,
that represents $AB$), or $1$ terminal symbol (like [1] and [2]
in the example, that represent $a$ and $b$, respectively).

counterexample: array of int
The output is the counterexample that proves that the grammars are not
equivalent (for words of the specified length), i.e., a word that is generated
by one of the grammars but not by the other. Such word must be represented with
an array with size length whose elements are terminal symbols among
$1,2,\ldots,{}$numterminals.