Gherkin

Gherkin is a parser and compiler for the Gherkin language.

Gherkin is currently implemented for the following platforms (in order of birthday):

.NET
Java
JavaScript
Ruby
Go
Python
C
Objective-C (currently not actively tested, requires maintenance)
Perl
PHP
Dart
C++

The CI will run using the linked workflow when that specific language implementation is changed.

The CI will also run for all linked workflows when any test data is modified (for example, when modifying one of the good or bad features or their ndjson outputs).

Contributing Translations (i18n)

In order to allow Gherkin to be written in a number of languages, the keywords have been translated into multiple languages. To improve readability and flow, some languages may have more than one translation for any given keyword.

If you are looking to add, update or improve these translations please see CONTRIBUTING.md.

Contributing a Parser Implementation

See CONTRIBUTING.md if you want to contribute a parser for a new programming language. Our wish-list is (in no particular order):

Rust

Usage

Gherkin can be used either through its command line interface (CLI) or as a library.

It is designed to be used in conjunction with other tools such as Cucumber, which consumes the output from the CLI or library as Cucumber Messages.

Library

Using the library is the preferred way to use Gherkin since it produces easily consumable AST and Pickle objects in-process without having to fork a CLI process or parse JSON.

The library itself provides a stream API, which is what the CLI is based on. This is the recommended way to use the library as it provides a high level API that is easy to use. See the CLI implementations to get an idea of how to use it.

Alternatively, you can use the lower level parser and compiler. Some usage examples are below:

Java

Path path = Paths.get("../testdata/good/minimal.feature");
GherkinParser parser = GherkinParser.builder().build();
// Parse the file at `path` and keep only the envelopes that carry a pickle
Stream<Envelope> pickles = parser.parse(path).filter(envelope -> envelope.getPickle().isPresent());

C#

var parser = new Parser();
var gherkinDocument = parser.Parse(@"Drive:\PathToGherkinDocument\document.feature");

Ruby

require 'gherkin/parser'
require 'gherkin/pickles/compiler'

source = {
  uri: 'uri_of_the_feature.feature',
  data: 'Feature: ...',
  mediaType: 'text/x.cucumber.gherkin+plain'
}

gherkin_document = Gherkin::Parser.new.parse(source[:data])
id_generator = Cucumber::Messages::IdGenerator::UUID.new

pickles = Gherkin::Pickles::Compiler.new(id_generator).compile(gherkin_document, source)

JavaScript

var Gherkin = require("@cucumber/gherkin");
var Messages = require("@cucumber/messages");

var uuidFn = Messages.IdGenerator.uuid();
var builder = new Gherkin.AstBuilder(uuidFn);
var matcher = new Gherkin.GherkinClassicTokenMatcher(); // or Gherkin.GherkinInMarkdownTokenMatcher()

var parser = new Gherkin.Parser(builder, matcher);
var gherkinDocument = parser.parse("Feature: ...");
var pickles = Gherkin.compile(
  gherkinDocument,
  "uri_of_the_feature.feature",
  uuidFn
);

Go

// Download the packages via:
//   go get github.com/cucumber/gherkin/go/v27
//   go get github.com/cucumber/messages/go/v22
package main

import (
  "fmt"
  "strings"

  gherkin "github.com/cucumber/gherkin/go/v27"
  messages "github.com/cucumber/messages/go/v22"
)

func main() {
  uuid := &messages.UUID{} // or &messages.Incrementing{}
  reader := strings.NewReader(`Feature: ...`)
  gherkinDocument, err := gherkin.ParseGherkinDocument(reader, uuid.NewId)
  if err != nil {
    panic(err) // the parser reports syntax errors here
  }
  pickles := gherkin.Pickles(*gherkinDocument, "minimal.feature", uuid.NewId)
  fmt.Println(len(pickles)) // use the pickles so the program compiles
}

Python

from gherkin import Compiler, Parser

gherkin_document = Parser().parse("Feature: ...")
gherkin_document["uri"] = "uri_of_the_feature.feature"
pickles = Compiler().compile(gherkin_document)

Objective-C

#import "GHParser+Extensions.h"

GHParser * parser = [[GHParser alloc] init];
NSString * featureFilePath; // Should hold the path of the file that contains the feature
NSString * content = [NSString stringWithContentsOfFile:featureFilePath encoding:NSUTF8StringEncoding error:nil];
if([content stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]].length == 0){
      // GHParser will throw an error if you passed empty content... handle this issue first.
}
GHGherkinDocument * result = [parser parseContent:content];

Perl

use Gherkin::Parser;
use Gherkin::Pickles::Compiler;

my $parser = Gherkin::Parser->new();
my $gherkin_document = $parser->parse("Feature: ...");
my $pickles = Gherkin::Pickles::Compiler->compile($gherkin_document);

PHP

use Cucumber\Gherkin\GherkinParser;

$path = '/path/to/my.feature';

$parser = new GherkinParser();
$pickles = $parser->parseString(uri: $path, data: file_get_contents($path));

CLI

The Gherkin CLI, gherkin, reads Gherkin source files (.feature files) and outputs ASTs and Pickles.

The gherkin program takes any number of files as arguments and prints the results to STDOUT as Newline Delimited JSON.

Each line is a JSON document that conforms to the Cucumber Event Protocol.

To try it out, just install Gherkin for your favourite language, and run it over the files in this repository:

gherkin testdata/**/*.feature

NDJSON is easy for programs to read, but hard for people. To pretty-print each JSON document, you can pipe the output to the jq program:

gherkin testdata/**/*.feature | jq
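For programmatic consumption, the same output can be read line by line. The following is a minimal sketch in Python, not part of Gherkin itself: the helper names read_ndjson and group_by_type are hypothetical, the sample lines are hand-written, and the top-level "type" field is assumed from the sample output shown later in this document (other versions of the output format may lay out events differently).

```python
import json


def read_ndjson(lines):
    """Parse an iterable of NDJSON lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def group_by_type(events):
    """Group events by their top-level "type" field (assumed to exist)."""
    grouped = {}
    for event in events:
        grouped.setdefault(event.get("type", "unknown"), []).append(event)
    return grouped


# Hand-written sample lines standing in for real CLI output.
sample_lines = [
    '{"type": "source", "uri": "a.feature"}',
    '{"type": "pickle", "uri": "a.feature", "pickle": {"name": ""}}',
    "",
]

events = read_ndjson(sample_lines)
grouped = group_by_type(events)
print({kind: len(items) for kind, items in grouped.items()})
```

To process real output, an open file or sys.stdin can be passed to read_ndjson in place of sample_lines.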

Table cell escaping

If you want to use a newline character in a table cell, you can write this as \n. If you need a | as part of the cell, you can escape it as \|. And finally, if you need a \, you can escape that with \\.
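The three escapes above can be decoded in a single left-to-right pass. The sketch below is an illustration in Python, not the code used by any Gherkin implementation; unescape_cell and split_row are hypothetical helper names, and leaving unknown escapes such as \x untouched is an assumption.

```python
def unescape_cell(raw):
    """Decode \\n, \\| and \\\\ in one table cell; leave other text as-is."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            if nxt == "n":
                out.append("\n")
                i += 2
                continue
            if nxt in ("|", "\\"):
                out.append(nxt)
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)


def split_row(line):
    """Split a table row on unescaped pipes, then trim and decode each cell."""
    raw_cells, current, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Keep the escape pair together so an escaped pipe never splits.
            current.append(line[i:i + 2])
            i += 2
            continue
        if ch == "|":
            raw_cells.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    # raw_cells[0] is whatever precedes the first pipe; text after the last
    # pipe is still in `current` and is discarded.
    return [unescape_cell(cell.strip()) for cell in raw_cells[1:]]


cells = split_row(r"| a\|b | c\\d | e\nf |")
print(cells)
```

Splitting before decoding matters: if the cells were decoded first, an escaped pipe would become indistinguishable from a cell separator.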

Architecture

The following diagram outlines the architecture:

graph LR
    A[Feature file] -->|Scanner| B[Tokens]
    B -->|Parser| D[AST]

The scanner reads a Gherkin document (typically read from a .feature file) and creates a token for each line. The tokens are passed to the parser, which outputs an AST (Abstract Syntax Tree).

If the scanner sees a #language header, it will reconfigure itself dynamically to look for Gherkin keywords for the associated language. The keywords are defined in gherkin-languages.json.
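The reconfiguration step can be illustrated with a much-simplified line scanner. This is a sketch under stated assumptions, not the real scanner: the header regular expression is an assumption, KEYWORDS is a tiny hand-picked excerpt in the spirit of gherkin-languages.json rather than the real file, and only two keyword kinds are recognised.

```python
import re

# Assumed shape of the language header; the real scanner may differ.
LANGUAGE_HEADER = re.compile(r"^\s*#\s*language\s*:\s*([A-Za-z_-]+)\s*$")

# Illustrative excerpt only, not the full gherkin-languages.json.
KEYWORDS = {
    "en": {"feature": ["Feature"], "scenario": ["Scenario", "Example"]},
    "fr": {"feature": ["Fonctionnalité"], "scenario": ["Scénario", "Exemple"]},
}


def scan(text, default_language="en"):
    """Yield (line_number, token_type, text) for each line of the input."""
    keywords = KEYWORDS[default_language]
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        header = LANGUAGE_HEADER.match(line)
        if header:
            # Switch keyword tables; an unknown language raises KeyError here.
            keywords = KEYWORDS[header.group(1)]
            yield number, "language", header.group(1)
            continue
        for token_type, words in keywords.items():
            word = next((w for w in words if stripped.startswith(w + ":")), None)
            if word:
                yield number, token_type, stripped[len(word) + 1:].strip()
                break
        else:
            yield number, "other", stripped


french = "# language: fr\nFonctionnalité: Panier\n  Scénario: Ajouter\n    Soit un panier vide"
tokens = list(scan(french))
print(tokens)
```

After the header on line 1, the French keywords are matched on the following lines, while the same input without the header would produce only "other" tokens.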

The scanner is hand-written, but the parser is generated by the Berp parser generator as part of the build process.

Berp takes a grammar file (gherkin.berp) and a template file (gherkin-X.razor) as input and outputs a parser in language X:

graph TD
    A[gherkin.berp] --> B[berp.exe]
    C[gherkin-X.razor] --> B
    B --> D[Parser.x]

Abstract Syntax Tree (AST)

The AST produced by the parser can be described with the following class diagram:

classDiagram
    ScenarioOutline --|> ScenarioDefinition
    GherkinDocument "1" *-- "0..1" Comment: comment
    GherkinDocument "1" *-- "0..1" Feature: feature
    Feature "1" *-- "0..*" ScenarioDefinition: scenarioDefinitions
    Feature "1" *-- "0..*" Rule: rules
    Rule "1" *-- "0..*" ScenarioDefinition: scenarioDefinitions
    Background "0..1" --* "1" Rule: background
    Feature "1" *-- "0..1" Background: background
    Scenario --|> ScenarioDefinition
    Tag "0..*" --* "1" Feature: tags
    Tag "0..*" --* "1" Rule: tags
    Tag "0..*" --* "1" Scenario: tags
    Tag "0..*" --* "1" ScenarioOutline: tags
    Tag "0..*" --* "1" Examples: tags
    Examples "0..*" --* "1" ScenarioOutline: examples
    TableRow "1" --* "1" Examples: header
    TableRow "0..*" --* "1" Examples: rows
    Background "1" *-- "0..*" Step: steps
    Step "0..*" --* "1" ScenarioDefinition: steps
    StepArgument "0..1" --* "1" Step: stepArgument
    DataTable --|> StepArgument
    StepArgument <|-- DocString
    TableRow "0..*" --* "1" DataTable: rows
    TableRow "1" *-- "0..*" TableCell: cells
    class ScenarioDefinition {
        keyword
        name
        description
    }
    class Step {
        keyword
        text
    }
    class Examples {
        keyword
        name
        description
    }
    class Feature {
        language
        keyword
        name
        description
    }
    class Background {
        keyword
        name
        description
    }
    class Rule {
        keyword
        name
        description
    }
    class DocString {
        content
        contentType
    }
    class Comment {
        text
    }
    class TableCell {
        value
    }
    class Tag {
        name
    }
    class Location {
        line: int
        column: int
    }

Every class represents a node in the AST. Every node has a Location that describes the line number and column number in the input file. These numbers are 1-indexed.

All fields on nodes are strings (except for Location.line and Location.column).

The AST nodes are simple objects without behaviour, only data. It's up to the implementation to decide whether to use classes or just basic collections, but the AST must have a JSON representation (this is used for testing).

Each node in the JSON representation also has a type property with the name of the node type.

You can see some examples in the testdata/good directory.
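Because every node carries a type and a Location, a JSON AST can be traversed generically. The sketch below assumes only those two properties; sample_ast is a hand-written, hypothetical document (it is not copied from testdata/good), and iter_nodes is a hypothetical helper name.

```python
def iter_nodes(value):
    """Yield (type, line, column) for every dict with a type and a location."""
    if isinstance(value, dict):
        if "type" in value and "location" in value:
            location = value["location"]
            yield value["type"], location["line"], location["column"]
        for child in value.values():
            yield from iter_nodes(child)
    elif isinstance(value, list):
        for item in value:
            yield from iter_nodes(item)


# Hypothetical AST shaped after the class diagram above.
sample_ast = {
    "type": "GherkinDocument",
    "feature": {
        "type": "Feature",
        "location": {"line": 1, "column": 1},
        "keyword": "Feature",
        "name": "Minimal",
        "scenarioDefinitions": [
            {
                "type": "Scenario",
                "location": {"line": 3, "column": 3},
                "keyword": "Scenario",
                "name": "minimalistic",
                "steps": [
                    {
                        "type": "Step",
                        "location": {"line": 4, "column": 5},
                        "keyword": "Given ",
                        "text": "the minimalism",
                    }
                ],
            }
        ],
    },
}

nodes = list(iter_nodes(sample_ast))
print(nodes)
```

Dicts without a location, such as the top-level document in this sample, are traversed but not reported.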

Pickles

The AST isn't suitable for execution by Cucumber. It needs further processing into a simpler form called Pickles.

The compiler compiles the AST produced by the parser into pickles:

graph LR
    A[AST] -->|Compiler| B[Pickles]

The rationale is to decouple Gherkin from Cucumber so that Cucumber is open to support alternative formats to Gherkin (for example Markdown).

The simpler Pickles data structure also simplifies the internals of Cucumber. With the compilation logic maintained in the Gherkin library we can easily use the same test suite for all implementations to verify that compilation is behaving consistently between implementations.

Each Scenario will be compiled into a Pickle. A Pickle has a list of PickleStep, derived from the steps in a Scenario.

Each Examples row under Scenario Outline will also be compiled into a Pickle.

Any Background steps will also be compiled into the Pickle, ahead of the scenario's own steps.

Every tag, like @a, will be compiled into a Pickle as well (inheriting tags from parent elements in the Gherkin AST).

Example:

@a
Feature:
  @b @c
  Scenario Outline:
    Given <x>

    Examples:
      | x |
      | y |

  @d @e
  Scenario Outline:
    Given <m>

    @f
    Examples:
      | m |
      | n |

Using the CLI we can compile this into several pickle objects:

gherkin testdata/good/readme_example.feature --no-source --no-ast | jq

Output:

{
  "type": "pickle",
  "uri": "testdata/good/readme_example.feature",
  "pickle": {
    "name": "",
    "steps": [
      {
        "text": "y",
        "arguments": [],
        "locations": [
          {
            "line": 9,
            "column": 7
          },
          {
            "line": 5,
            "column": 11
          }
        ]
      }
    ],
    "tags": [
      {
        "name": "@a",
        "location": {
          "line": 1,
          "column": 1
        }
      },
      {
        "name": "@b",
        "location": {
          "line": 3,
          "column": 3
        }
      },
      {
        "name": "@c",
        "location": {
          "line": 3,
          "column": 6
        }
      }
    ],
    "locations": [
      {
        "line": 9,
        "column": 7
      },
      {
        "line": 4,
        "column": 3
      }
    ]
  }
}
{
  "type": "pickle",
  "uri": "testdata/good/readme_example.feature",
  "pickle": {
    "name": "",
    "steps": [
      {
        "text": "n",
        "arguments": [],
        "locations": [
          {
            "line": 18,
            "column": 7
          },
          {
            "line": 13,
            "column": 11
          }
        ]
      }
    ],
    "tags": [
      {
        "name": "@a",
        "location": {
          "line": 1,
          "column": 1
        }
      },
      {
        "name": "@d",
        "location": {
          "line": 11,
          "column": 3
        }
      },
      {
        "name": "@e",
        "location": {
          "line": 11,
          "column": 6
        }
      },
      {
        "name": "@f",
        "location": {
          "line": 15,
          "column": 5
        }
      }
    ],
    "locations": [
      {
        "line": 18,
        "column": 7
      },
      {
        "line": 12,
        "column": 3
      }
    ]
  }
}

Each Pickle event also contains the path to the original source. This is useful for generating reports and stack traces when a Scenario fails.
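The compilation rules for Scenario Outlines and tags can be sketched in a few lines of Python. This is a simplified illustration, not the actual compiler: compile_outline and the dictionary layout are hypothetical, and locations, ids, step arguments and Background steps are left out.

```python
def compile_outline(feature_tags, outline):
    """Compile one Scenario Outline into one pickle per Examples row."""
    pickles = []
    for examples in outline["examples"]:
        header = examples["header"]
        for row in examples["rows"]:
            values = dict(zip(header, row))
            steps = []
            for text in outline["steps"]:
                # Replace each <placeholder> with the value from this row.
                for name, value in values.items():
                    text = text.replace("<" + name + ">", value)
                steps.append(text)
            pickles.append({
                "name": outline["name"],
                "steps": steps,
                # Tags are inherited from the Feature, the outline and the Examples.
                "tags": feature_tags + outline["tags"] + examples["tags"],
            })
    return pickles


# The two outlines from the example feature above, written out by hand.
first = {
    "name": "", "tags": ["@b", "@c"], "steps": ["<x>"],
    "examples": [{"tags": [], "header": ["x"], "rows": [["y"]]}],
}
second = {
    "name": "", "tags": ["@d", "@e"], "steps": ["<m>"],
    "examples": [{"tags": ["@f"], "header": ["m"], "rows": [["n"]]}],
}

pickles = compile_outline(["@a"], first) + compile_outline(["@a"], second)
for pickle in pickles:
    print(pickle["steps"], pickle["tags"])
```

The step texts and tag lists produced here correspond to the "text" and "tags" entries in the two pickles shown in the output above.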

Cucumber will further transform this list of Pickle objects to a list of TestCase objects. TestCase objects link to user code such as Hooks and Step Definitions.

Building Gherkin

See CONTRIBUTING.md.

Markdown with Gherkin

See Markdown with Gherkin.
