Skip to content

A parallel program to parse a string of symbols. The inputs are a context-free grammar G in Chomsky Normal Form and a string of symbols. In the end, the program should print yes if the string of symbols can be derived by the rules of the grammar and no otherwise.

Notifications You must be signed in to change notification settings

Amagnum/Parallel-CYK-CFG-Parser-OMP

Repository files navigation

Parallel-CYK-CFG-Parser-OMP

image alt text

Problem Statement:

Write a parallel program to parse a string of symbols. The inputs are a context-free grammar G in Chomsky Normal Form and a string of symbols. In the end, the program should print yes if the string of symbols can be derived by the rules of the grammar and no otherwise. Write a sequential program (no OpenMP directives at all) as well.

Compare the running time of the sequential program with the running time of the parallel program and compute the speedup for different grammars and different string lengths. (OpenMP)

Solution Approach:

For deriving a string which follows certain grammar rules, which are in Chomsky-Normal Form, we can use the CYK algorithm (Cocke–Younger–Kasami algorithm).

It is straight-forward to parallelise the CYK algorithm for multi-core SMP machines using OpenMP. OpenMP programs are C++ programs with pragmas that indicate which loops should be parallelised. The main technical challenges in parallelising the CKY algorithm are synchronising the parallel threads and ensuring that different parallel threads do not interfere with each other. This is achieved by using synchronisation constructs with implicit barriers, thread-private temporary variables and constructs that ensure that updates to shared variables occur as critical operations.

image alt text

fig 1.1 (Eg. Parsing table created during CYK algorithm)

Motivation:

Using parallel programming concepts we can divide our bigger problem into a smaller problem, and the smaller individual problem can be assigned to a single core/processor/thread, here we use OpenMP (multi threading library) to achieve parallel execution. Here we are doing Multi-Threading.

Solution given by OpenMP (Multi-Threading):

  • In Multithreading, a common address space is shared by all the threads. hence we need not to replicate data to each of the other threads, the data remains shared among all the threads.

  • In Multithreading, Process creation is economical, hence it does not use much resources.

  • OpenMP provides solution to handle critical section, by using #pragma omp critical(updateData)

image alt text

Serial Execution code for CYK Algorithm

image alt text

Parallel execution code for CYK Algorithm

How to run?
$ g++ -fopenmp main.cpp -o main
$ ./main ./gram1.in < input1.in

Notes: Create one file gram1.in which contains the grammar, eg:

S->AC|AB

C->SB

A->a

B->b

Create another file input.in which contains all the strings.

3

aacbb

aaaabaaaaaabbbbbbcbbbbbbaaaaaabaaaa

baaaabaaaaaabbbbbbccbbbbbbaaaaaabaaaab

Testing for sample Input (Correctness):

Given CFG (for easier understanding):

S -> aSa | bSb | c

CNF of above grammar (Input to the Program)

S -> BA | DC | c

A -> a

B -> AS

C -> b

D -> CS

Sample Input 1:

aaaabaaaaaabbbbbbcbbbbbbaaaaaabaaaa

Expected Output 1:

Yes

Sample Input 2:

baaaabaaaaaabbbbbbccbbbbbbaaaaaabaaaab

Expected Output 1:

No

Program output:

image alt text

Results:

Comparing the running time of the sequential program with the running time of the parallel program and computing the speedup for different grammars and different string lengths.

The execution time of the problem decreases with increasing the no. of threads, however after increasing the treads count by a certain number of threads, there is no significant improvement in execution time.

Program run Time v/s No of threads

image alt text

Fig: General execution pattern (for all grammars)

Speedup comparisons:

Comparing speedup vs no. of threads for different sizes of data. For large datasets parallel computing usually performs better.

Grammar 1 S->BA|DC|c A->a B->AS C->b D->CS Grammar 2 S->AC|AB C->SB A->a B->b Grammar 3 S->a|XA|AX|b A->RB B->AX|b|a X->a R->XB

String Size ->

No of Threads 5 35 38 50 100 133 900 1200 600
1 0.45640666 0.9907694 1.01579957 0.99248479 0.92442636 0.95771403 1.03116725 1.0370501 1.03671726
2 0.1534068 1.0239146 0.99169445 1.03813392 1.07155521 0.94979691 1.25034979 1.27576512 1.24965535
3 0.07835094 0.59557128 0.72511692 0.87092211 1.01903621 1.0319756 1.25444677 1.2797407 1.24452983
4 0.08268294 0.86761364 0.83336513 0.92842474 1.03267406 1.03629382 1.25441626 1.28088945 1.24374479
5 0.05463936 0.72818829 0.70765348 0.85970993 1.01655306 1.03336775 1.25545664 1.28282309 1.24139685
6 0.0492262 0.69891668 0.69836049 0.84802304 0.98992425 0.99108947 1.25371 1.27728143 1.24207788
7 0.04561572 0.63906515 0.62621847 0.77049791 0.98685954 1.0062568 1.25170224 1.27855614 1.2414088
8 0.03747077 0.61240706 0.59025122 0.74798595 0.97621128 0.9450912 1.25042621 1.27762688 1.23824785
9 0.03370913 0.56964707 0.52830135 0.72378545 0.96146722 0.9939444 1.25078515 1.27982125 1.2397764
10 0.03432386 0.55278812 0.53512065 0.71054016 0.95268262 0.98908832 1.25151823 1.27283151 1.23697332

fig: Speedup table for grammar 3

image alt text

Fig: Grammar 1

image alt text

Fig: Grammar 2

image alt text

Fig: Grammar 3

REFERENCES

  1. Dr. Chandresh Kumar Maurya, Assisant professor, IIT Indore link

About

A parallel program to parse a string of symbols. The inputs are a context-free grammar G in Chomsky Normal Form and a string of symbols. In the end, the program should print yes if the string of symbols can be derived by the rules of the grammar and no otherwise.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages