Parallel-CYK-CFG-Parser-OMP

Problem Statement:

Write a parallel program to parse a string of symbols. The inputs are a context-free grammar G in Chomsky Normal Form and a string of symbols. In the end, the program should print yes if the string of symbols can be derived by the rules of the grammar and no otherwise. Write a sequential program (no OpenMP directives at all) as well.

Compare the running time of the sequential program with the running time of the parallel program and compute the speedup for different grammars and different string lengths. (OpenMP)

Solution Approach:

For deriving a string which follows certain grammar rules, which are in Chomsky-Normal Form, we can use the CYK algorithm (Cocke–Younger–Kasami algorithm).

It is straight-forward to parallelise the CYK algorithm for multi-core SMP machines using OpenMP. OpenMP programs are C++ programs with pragmas that indicate which loops should be parallelised. The main technical challenges in parallelising the CKY algorithm are synchronising the parallel threads and ensuring that different parallel threads do not interfere with each other. This is achieved by using synchronisation constructs with implicit barriers, thread-private temporary variables and constructs that ensure that updates to shared variables occur as critical operations.

fig 1.1 (Eg. Parsing table created during CYK algorithm)

Motivation:

Using parallel programming concepts we can divide our bigger problem into a smaller problem, and the smaller individual problem can be assigned to a single core/processor/thread, here we use OpenMP (multi threading library) to achieve parallel execution. Here we are doing Multi-Threading.

Solution given by OpenMP (Multi-Threading):

In Multithreading, a common address space is shared by all the threads. hence we need not to replicate data to each of the other threads, the data remains shared among all the threads.
In Multithreading, Process creation is economical, hence it does not use much resources.
OpenMP provides solution to handle critical section, by using #pragma omp critical(updateData)

Serial Execution code for CYK Algorithm

Parallel execution code for CYK Algorithm

How to run?

$ g++ -fopenmp main.cpp -o main

$ ./main ./gram1.in < input1.in

Notes: Create one file gram1.in which contains the grammar, eg:

S->AC|AB

C->SB

A->a

B->b

Create another file input.in which contains all the strings.

3

aacbb

aaaabaaaaaabbbbbbcbbbbbbaaaaaabaaaa

baaaabaaaaaabbbbbbccbbbbbbaaaaaabaaaab

Testing for sample Input (Correctness):

Given CFG (for easier understanding):

S -> aSa | bSb | c

CNF of above grammar (Input to the Program)

S -> BA | DC | c

A -> a

B -> AS

C -> b

D -> CS

Sample Input 1:

aaaabaaaaaabbbbbbcbbbbbbaaaaaabaaaa

Expected Output 1:

Yes

Sample Input 2:

baaaabaaaaaabbbbbbccbbbbbbaaaaaabaaaab

Expected Output 1:

No

Program output:

Results:

Comparing the running time of the sequential program with the running time of the parallel program and computing the speedup for different grammars and different string lengths.

The execution time of the problem decreases with increasing the no. of threads, however after increasing the treads count by a certain number of threads, there is no significant improvement in execution time.

Program run Time v/s No of threads

Fig: General execution pattern (for all grammars)

Speedup comparisons:

Comparing speedup vs no. of threads for different sizes of data. For large datasets parallel computing usually performs better.

Grammar 1 S->BA|DC|c A->a B->AS C->b D->CS

Grammar 2 S->AC|AB C->SB A->a B->b

Grammar 3 S->a|XA|AX|b A->RB B->AX|b|a X->a R->XB

String Size ->

No of Threads	5	35	38	50	100	133	900	1200	600
1	0.45640666	0.9907694	1.01579957	0.99248479	0.92442636	0.95771403	1.03116725	1.0370501	1.03671726
2	0.1534068	1.0239146	0.99169445	1.03813392	1.07155521	0.94979691	1.25034979	1.27576512	1.24965535
3	0.07835094	0.59557128	0.72511692	0.87092211	1.01903621	1.0319756	1.25444677	1.2797407	1.24452983
4	0.08268294	0.86761364	0.83336513	0.92842474	1.03267406	1.03629382	1.25441626	1.28088945	1.24374479
5	0.05463936	0.72818829	0.70765348	0.85970993	1.01655306	1.03336775	1.25545664	1.28282309	1.24139685
6	0.0492262	0.69891668	0.69836049	0.84802304	0.98992425	0.99108947	1.25371	1.27728143	1.24207788
7	0.04561572	0.63906515	0.62621847	0.77049791	0.98685954	1.0062568	1.25170224	1.27855614	1.2414088
8	0.03747077	0.61240706	0.59025122	0.74798595	0.97621128	0.9450912	1.25042621	1.27762688	1.23824785
9	0.03370913	0.56964707	0.52830135	0.72378545	0.96146722	0.9939444	1.25078515	1.27982125	1.2397764
10	0.03432386	0.55278812	0.53512065	0.71054016	0.95268262	0.98908832	1.25151823	1.27283151	1.23697332

fig: Speedup table for grammar 3

Fig: Grammar 1

Fig: Grammar 2

Fig: Grammar 3

REFERENCES

Dr. Chandresh Kumar Maurya, Assisant professor, IIT Indore link

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
src		src
image_0.png		image_0.png
image_1.png		image_1.png
image_2.png		image_2.png
image_3.png		image_3.png
image_4.png		image_4.png
image_5.png		image_5.png
image_6.png		image_6.png
image_7.png		image_7.png
image_8.png		image_8.png
image_9.png		image_9.png
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

src

src

image_0.png

image_0.png

image_1.png

image_1.png

image_2.png

image_2.png

image_3.png

image_3.png

image_4.png

image_4.png

image_5.png

image_5.png

image_6.png

image_6.png

image_7.png

image_7.png

image_8.png

image_8.png

image_9.png

image_9.png

readme.md

readme.md

Repository files navigation

Parallel-CYK-CFG-Parser-OMP

Solution Approach:

Motivation:

Testing for sample Input (Correctness):

Results:

Speedup comparisons:

REFERENCES

About

Releases

Packages

Languages

Amagnum/Parallel-CYK-CFG-Parser-OMP

Folders and files

Latest commit

History

Repository files navigation

Parallel-CYK-CFG-Parser-OMP

Solution Approach:

Motivation:

Testing for sample Input (Correctness):

Results:

Speedup comparisons:

REFERENCES

About

Topics

Resources

Stars

Watchers

Forks

Languages