Skip to content

Instantly share code, notes, and snippets.

@pavly-gerges
Last active June 9, 2025 10:22
Show Gist options
  • Save pavly-gerges/f1e5ff1a157da09f1e92b45d8a562462 to your computer and use it in GitHub Desktop.
Save pavly-gerges/f1e5ff1a157da09f1e92b45d8a562462 to your computer and use it in GitHub Desktop.

Pattern Matching Algorithms

Algorithmic techniques that involve matching a pattern (or domain elements) against another domain (or co-domain) or an argument of domains; therefore, a pattern is a repeated set of some elements (e.g., symbols, characters, or other patterns) in other domains in the same order or in different order.

Normal Forms Technique:

  • A not well-known technique is using predicate and quantificational logic to match a pattern against a domain. Normal Forms (CNF, DNF & PNF) could be utilized for this benefit.
  • Pattern matching can be crossed and used in other algorithmic structures (for example: path finding algorithms by matching some metadata and distance formulas when comparing edges from an adjacency matrix.

Start by an example (i.e., A Miniaturized Scientific Model):

Match ("Word" in "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz0123456789!=><"):

  • Starts by initializing a domain.
  • Define patterns to be matched against the domain.
  • Tests whether the patterns (e.g., "Word") is present in this domain.
  • If present, finds the start index and the end index of the matched pattern.
  • If present, finds the number of repetitions.
  • If repetitions are present, find the start and the end index of each one of them.
  • Group the repetitions in an output collection.

Discrete Logic (Quantificational and Predicate Logic):

  • Matching elements from the domain against their counterparts from the co-domain; such that all elements must be present at least once; could be described using this diagram and this quantificational Prenex Normal Form (PNF), and its equivalent Conjunctive Normal Form (CNF):

$$ \begin{aligned} \text{Let, } P(w_i, a_j) \equiv (w_i == a_j) \text{; where } w \in W,\ a \in A,\ i,j,m,n \in N \\ \therefore, \forall (w) \exists (a) P(w, a) \equiv \bigwedge_{i = 0}^{m} [\bigvee_{j = 0}^{n} P(w_i, a_j)]\\ \equiv [(P(w_0, a_0) \vee P(w_0, a_1) \vee P(w_0, a_2) \vee ... P(w_0, a_n))] \\ \land [(P(w_1, a_0) \vee P(w_1, a_1) \vee P(w_1, a_2) \vee ... P(w_1, a_n))] \\ \land ... [(P(w_m, a_0) \vee P(w_m, a_1) \vee P(w_m, a_2) \vee ... P(w_m, a_n))] \end{aligned} $$

  • Matching elements from the domain against their counterparts from the co-domain; such that all elements must be present at most once (i.e., only and only once); could be described using this diagram and this quantificational Prenex Normal Form (PNF), and its equivalent Conjunctive Normal Form (CNF):

$$ \begin{aligned} \text{Let, } P(w_i, a_j) \equiv (w_i == a_j) \text{; where } w \in W,\ a \in A,\ i,j,m,n \in N \\ \therefore, \forall (w) \exists^{\oplus} (a) P(w, a) \equiv \bigwedge_{i = 0}^{m} [\bigoplus_{j = 0}^{n} P(w_i, a_j)]\\ \equiv [(P(w_0, a_0) \oplus P(w_0, a_1) \oplus P(w_0, a_2) \oplus ... P(w_0, a_n))] \\ \land [(P(w_1, a_0) \oplus P(w_1, a_1) \oplus P(w_1, a_2) \oplus ... P(w_1, a_n))] \\ \land ... [(P(w_m, a_0) \oplus P(w_m, a_1) \oplus P(w_m, a_2) \oplus ... P(w_m, a_n))] \end{aligned} $$

  • Matching elements from the domain against their counterparts from the co-domain; such that none of the elements must be present at all; could be described using this diagram and this quantificational Prenex Normal Form (PNF), and its equivalent CNF:

$$ \begin{aligned} \text{Let, } P(w_i, a_j) \equiv \neg(w_i == a_j) \text{; where } w \in W,\ a \in A,\ i,j,m,n \in N \\ \therefore, \forall (w) \forall (a) P(w, a) \equiv \bigwedge_{i = 0}^{m} [\bigwedge_{j = 0}^{n} P(w_i, a_j)]\\ \equiv [(P(w_0, a_0) \land P(w_0, a_1) \land P(w_0, a_2) \land ... P(w_0, a_n))] \\ \land [(P(w_1, a_0) \land P(w_1, a_1) \land P(w_1, a_2) \land ... P(w_1, a_n))] \\ \land ... [(P(w_m, a_0) \land P(w_m, a_1) \land P(w_m, a_2) \land ... P(w_m, a_n))] \end{aligned} $$

Tip

  • To keep the relation among elements; another predicate function is introduced; a delta function $$\delta: (a_j, a_{j-1}, w_i, w_{i-1}) \rightarrow ((j - k) == (i - l))$$; where $$j,\ i$$ are the indices of the current elements and $$k,\ l$$ are the indices of the antecedents.
  • It's then required to conjoin this delta function with the predicate function $$P(w_i, a_j)$$ into a new function $$P_{\delta}: (w_i, a_j, a_{j-1}, w_{i-1}) \rightarrow ((w_i == a_j) \land ((j - k) == (i - l))$$; where $$j,\ i$$ are the indices of the current elements and $$k,\ l$$ are the indices of the antecedents.

Software Design:

  • Create a Data Structure (e.g., a pattern struct).
  • Create a collection data structure for matched patterns (e.g., a matchable).
  • A domain is also a pattern, but a larger pattern.
  • Pattern struct is an abstract data structure for pattern matching; wherein 2 or 3 or more n-patterns are matched against a domain type.
  • Functions of pattern matching would accept the following:
    • A domain pattern object.
    • A pattern object to be matched.
    • An output matchable object that could hold the result.

[pattern_matching.h]

...
typedef struct pattern (pattern);
typedef struct matchable (matchable);
typedef struct pattern_matching (pattern_matching);

/**
 * @brief A pattern is defined as a structure that wraps a pointer to a
 * a data buffer; and another 2 pointers marking the boundaries of the matchable part.
 * @note The number of elements or the count could be find by performing this delta function 
 * (end_address - start_address). While the start index could be found by performing this delta function 
 * (((long) data) - start_address), and the last index by performing this delta (((long) data) - end_address).
 */
struct pattern {
   void *data;
   long start_address;
   long end_address;
   uint8_t is_domain;
   uint8_t is_matched;
};

/**
 * @brief A matchable is defined as a structure wrapping a pointer to a buffer of the
 * addresses of the matched patterns, another pointer to the domain buffer, and the 
 * number of the matched patterns.
 */
struct matchable {
   pattern **patterns;
   pattern *domain;
   uint64_t count;
};

struct pattern_matching {
    /**
    * @brief Matches all the elements from the first pattern against the second pattern
    * (i.e., the co-domain pattern or the alphabet) and serializes the result into
    * a matchable structure. Returns SUCCESS status code if all the elements from the 
    * domain are found once or multiple times in the co-domain, or FAILURE status code
    * otherwise.
    *
    * @note This examination corrupts with failure if the [compare] address is invalid.
    * 
    * @param pattern a set of elements to try to match against the co-domain.
    * @param pattern the co-domain to match the elements from the first pattern against.
    * @return the status of the pattern matching.
    */
    status_code (*match_all)(pattern *, pattern *, matchable *);

    /**
    * @brief Matches elements from the first pattern (i.e., the domain) against the 
    * second pattern (i.e., the co-domain); such that elements from the domain are 
    * found ONLY ONCE (i.e., no repetition) when compared with their counterpart
    * co-domain space.
    * 
    * @note This examination corrupts with failure if the [compare] address is invalid.
    * 
    * @param pattern a set of elements to try to match against the co-domain.
    * @param pattern the co-domain to match the elements from the first pattern against.
    * @return the status of the pattern matching.
    */
    status_code (*match_once)(pattern *, pattern *, matchable *);

    /**
    * @brief Matches elements from the first pattern (i.e., the domain) against the 
    * second pattern (i.e., the co-domain); such that none of the elements from the 
    * domain are found in their counterpart co-domain space.
    *
    * @note This examination corrupts with failure if the [compare] address is invalid.
    *
    * @param pattern a set of elements to try to match against the co-domain.
    * @param pattern the co-domain to match the elements from the first pattern against.
    * @return the status of the pattern matching.
    */
    status_code (*match_none)(pattern *, pattern *, matchable *);

    /**
    * Compares a literal of a pattern (i.e., domain) against another literal of another 
    * pattern (i.e., co-domain). Data is referenced by a data pointer; a start address, and an 
    * end address. If those addresses coincides, then it's only one element that needs to be 
    * matched. Override to provide a polymorphic action, there is no default action.
    * Returns ASSERTION_SUCCESS status code if comparison is sane, ASSERTION_FAILURE otherwise.
    * 
    * @param pattern a set of elements to try to match against the co-domain.
    * @param pattern the co-domain to match the elements from the first pattern against.
    * @return the status of the pattern matching.
    */
    status_code (*compare)(pattern, pattern);
};

status_code init_pattern_matching(pattern_matching *, status_code (*compare)(pattern, pattern));
...
  • The matching algorithms would be based on Logical Quantification Formula; such that:
    • match_all(pattern *, pattern *, matchable *) would be a CNF algorithm; such that the literals are quantified using an existential Inclusive OR (IOR) $$\exists$$ quantifier.
    • match_once(pattern *, pattern *, matchable *) would be a Conjunctive Form algorithm; such that the literals are quantified using an existential EXOR $$\exists^{\oplus}$$ quantifier.
    • match_none(pattern *, pattern *, matchable *) would be a CNF algorithm; such that the literals are quantified using a Universal Quantifier $$\forall$$ quantifier.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment