Skip to content

Releases: caleb531/automata

v8.3.0

23 Mar 20:19
bbf3d58
Compare
Choose a tag to compare

Notable Changes

  • Added a new DFA.from_substrings method using the Aho-Corasick algorithm (#167)
  • Fixed support for Python 3.12
  • The contributing guide has been reworked with clearer information on how you can contribute to the Automata library!

Miscellaneous

  • Various minor bugfixes in the codebase
  • Updated dev dependencies to latest (non-breaking) versions
  • Fixed the performance of DFA.maximum_word_length (#202 and #205)

v8.2.0

31 Dec 00:17
ca1bc9a
Compare
Choose a tag to compare

New Features

  • Added a show_diagram method for Push Down Automata (#177)

New Documentation Site!

  • The Automata documentation has moved to a brand new documentation website with better navigation and a new search capability that should greatly improve the experience. Check it out at https://caleb531.github.io/automata/

Fixes

  • Fixed the handling of the quantifier pattern in regular expressions by switching from greedy matching (*) to lazy matching (*?) (#181)

v8.1.0

08 Oct 23:14
06cdef7
Compare
Choose a tag to compare

New Features

  • Added an optional minify step (default-enabled) to the DFA.to_partial() method

Announcements

  • We are currently in the process of submitting the Automata project to the Journal of Open Source Software (JOSS), which will introduce the project to a broader community of research using open source software
    • Please see here for the status of the approval process
    • You can read the full paper itself here

v8.0.0

18 Jul 19:18
187f2b9
Compare
Choose a tag to compare

Automata v8 is a performance and polish-focused release, providing improvements
under the hood that should improve the quality of life for most package users.

Huge thanks to @khoda81, and @EduardoGoulart1 for the new features in this release!

New Features

Jupyter Notebook Integration (#129)

All finite automata (subclasses of the FA class) now have native support for diagram
creation in Jupyter notebooks! This is enabled by installing the optional visual dependency, which includes the
pygraphviz and coloraide dependencies used in the show_diagram() methods for DFAs/NFAs.
The styling for the diagrams has been changed from previous releases and is
modeled after the diagrams used in the
visual automata package.

The show_diagram() now returns an AGraph corresponding to the given FA,
and the FA will render a diagram automatically inside of a Juyter notebook. See the
FA documentation for more details.

Type Hints (#131)

Type annotations have been added to the library, providing greater information
when using code completion tools like Pylance. This also makes it easier to type-check
application code using the library.

Better Partial DFA Support (#147)

Methods in the DFA class now have greater support for partial DFAs (ones with
some missing transitions), including support for more efficient binary operations
with partial DFAs. Partial DFAs have smaller description complexity, and thus
some algorithms are more efficient when operating on them.

In addition, the from_prefix, from_finite_language, and from_nfa DFA creation
functions now return partial DFAs by default. The new to_partial and to_complete
methods can be used to convert a DFA into an equivalent partial or complete DFA
respectively.

my_dfa.to_partial()  # get equivalent partial DFA
my_dfa.to_complete()  # get equivalent partial DFA

Please see the DFA documentation for the full details on these new
methods (outlined below).

Smaller Changes

Regex Changes

  • Quantifiers were added to the regular expression parsing, representing the repetition
    of the preceding expression. For example, the regex a{1,2} would match a between 1
    and 2 times (#121).
  • A pair of parenthesis () can now be used to represent the empty string (#136).
  • The default set of input symbols for NFA.from_regex was changed to all ascii letters and digits (#121).
  • State names of NFAs constructed from regexes no longer persist between runs of the regex engine (#130).

Pickling Support (#125)

All automata now natively support pickling.

Examples (#146)

A section has been added to the documentation demonstrating example uses of the package, can
be found here.

Breaking Changes

Dependency Changes

Python 3.7 support has been dropped. Please upgrade to Python 3.8 or later to
use Automata v8.

Diagrams are no longer being generated using pydot; this dependency has been
dropped in favor of using the visual optional dependency, which will install
pygraphviz and coloraide used for generating figures. You should install
this optional dependency if you wish to generate figures. This change was to
allow for native support for displaying finite automaton in Jupyter notebooks.
The style of the diagrams has been lifted from the visual automata package,
so you should take a look at the diagrams generated and see if they are still
satisfactory.

Other new dependencies have been added, but these will be installed automatically
along with v8 of the package.

Greater Support for Partial DFAs

There is now greater support for partial DFAs, which included changing the
DFA.from_nfa() function to return a partial DFA instead of a complete one.
To obtain a complete DFA, you must now call DFA.from_nfa().to_complete(trap_state_name),
where trap_state_name will be used as the name for a trap state if one needs to
be added.

Type Hints

Type hints have now been added, meaning that code which previously called functions
with incorrect types may not have been flagged. See output from your typechecker
for more information.

NFA.from_regex default input symbols

The default set of input symbols for NFA.from_regex was changed to all ASCII letters and digits.
If needing to use a specific set of input symbols, use the input_symbols parameter.

v7.1.0

08 Jan 23:29
b9c195a
Compare
Choose a tag to compare
  • Added new enable_mutable_automata global configuration option if you need to maximize the performance of automaton creation
  • Optimizations to automaton creation even if you leave immutable automata enabled
  • Added a new Shuffle operator (denoted with the caret symbol ^) for regular expressions (e.g. a^b represents all permutations of a and b) (#112)
  • Fixed a bug with the conversion from GNFA to regular expression (#122 and #123)
  • Minor improvements to the error messaging for the automata.regex.parser module
  • Other small optimizations and improvements

v7.0.1

13 Nov 02:43
7e860a7
Compare
Choose a tag to compare
  • Minor style/performance improvements to GNFA class
  • Corrections to documentation
    • DFA.cardinality() actually raises an InfiniteLanguageException for
      infinite languages, rather than returning float('inf')
    • DFA.maximum_word_length() actually returns None for infinite languages,
      rather than returning float('inf')
    • Added missing documentation for EmptyLanguageException

v7.0.0

11 Nov 19:42
10b6c0d
Compare
Choose a tag to compare

Automata v7 is a packed release, introducing immutable automata instances, new
DFA/NFA operations, and performance optimizations across the board to make
automaton machines easier to construct, validate, and work with. The
documentation has also been reorganized to make browsing much easier.

Huge thanks to @eliotwrobson and @Tagl for their massive code contributions to
this release, as well as their feedback on v7's development!

New Features

Immutability

All Automaton instances are now fully immutable to protect against common
pitfalls, such as mutating an automaton to an invalid state after it's already
been validated.

This means that if you wish to make a change to an automaton instance, you must
retrieve its attributes as a dictionary (using the new input_parameters
property), make your desired change, then pass those parameters to the relevant
constructor. For example:

from automata.fa.dfa import DFA

dfa1 = DFA(
    states={'q0', 'q1', 'q2'},
    input_symbols={'0', '1'},
    transitions={
        'q0': {'0': 'q0', '1': 'q1'},
        'q1': {'0': 'q0', '1': 'q2'},
        'q2': {'0': 'q2', '1': 'q1'}
    },
    initial_state='q0',
    final_states={'q1'}
)
# You can still copy an automaton just fine
dfa2 = dfa.copy()
# If you want to make a change, you must create a new instance; please note
# that dfa2.input_parameters is always a deep copy of the input parameters for
# dfa2 (in other words, mutating dfa2.input_parameters will not actually mutate
# dfa2)
params = dfa2.input_parameters
params['final_states'] = {'q2'}
dfa3 = DFA(**params)

DFA and NFA Equivalence via ==

You can now use == to check if two DFA accept the same language. You can also
use == to check if two NFA instances accept the same language.

dfa1 == dfa2
nfa1 == nfa2

New DFA Methods

DFAs now include more methods for working with accepted languages. For example,
the library can now build you a DFA from a language represented by any sequence
of strings. Similarly, you can now iterate over a DFA, which will yield a string
accepted by that DFA at each iteration.

len(my_dfa)  # get the cardinality of a DFA
for word in my_dfa:  # loop over every string accepted by the DFA
    print(word)

Please see the DFA documentation for the full details on these new
methods (outlined below).

Click to expand new DFA methods

DFA.from_finite_language(cls, input_symbols, language)

Constructs the minimal DFA corresponding to the given finite language and input symbols.

DFA.from_finite_language(
    language={'aa', 'aaa', 'aaba', 'aabbb', 'abaa', 'ababb', 'abbab',
              'baa', 'babb', 'bbaa', 'bbabb', 'bbbab'},
    input_symbols={'a', 'b'})

DFA.complement(self, retain_names=False, minify=True)

Returns a new DFA which accepts all input rejected by the DFA on which
complement() is called.

minimal_complement_dfa = ~dfa
complement_dfa = dfa.complement(minify=False)

DFA.predecessor(self, input_str, strict=True, key=None)

Returns the first string accepted by the DFA that comes before the input string
in lexicographical order.

prev_word = dfa.predecessor('0110')
same_word = dfa.predecessor(prev_word, strict=False)

DFA.predecessors(self, input_str, strict=True, key=None)

Generates all strings that come before the input string in lexicographical
order.

# Generates all strings in a language in lexographical order
for word in dfa.predecessors(None):
    print(word)

DFA.successor(self, input_str, strict=True, key=None)

Returns the first string accepted by the DFA that comes after the input string
in lexicographical order.

next_word = dfa.successor('0110')
same_word = dfa.predecessor(next_word, strict=False)

DFA.successors(self, input_str, strict=True, key=None, reverse=False)

Generates all strings that come after the input string in lexicographical order

# Generates all strings in a language in lexographical order
for word in dfa.successors(None):
    print(word)

DFA.minimum_word_length(self)

Returns the length of the shortest word in the language represented by the DFA.

dfa.minimum_word_length()

DFA.maximum_word_length(self)

Returns the length of the longest word in the language represented by the DFA.
In the case of infinite languages, float('inf') is returned.

dfa.maximum_word_length()

DFA.count_words_of_length(self, k)

Counts words of length k accepted by the DFA.

dfa.count_words_of_length(3)

DFA.words_of_length(self, k)

Generates words of length k accepted by the DFA.

for word in dfa.words_of_length(3):
    print(word)

You can also iterate through all words accepted by the DFA.
Be aware that the generator may be infinite.

for word in dfa:
    if len(word) > 10:
        break
    print(word)

DFA.cardinality(self)

Returns the cardinality of the language represented by the DFA. Note that
len(dfa) raises a ValueError for infinite languages, whereas
DFA.cardinality will return float('inf').

dfa.cardinality()
len(dfa)

DFA.from_prefix(cls, input_symbols, prefix, contains=True)

Directly computes the minimal DFA recognizing strings with the given prefix.

contains_prefix_nano = DFA.from_prefix({'a', 'n', 'o', 'b'}, 'nano')
avoids_prefix_nano = DFA.from_prefix({'a', 'n', 'o', 'b'}, 'nano', contains=False)

DFA.from_suffix(cls, input_symbols, suffix, contains=True)

Directly computes the minimal DFA recognizing strings with the given prefix.

contains_suffix_nano = DFA.from_suffix({'a', 'n', 'o', 'b'}, 'nano')
avoids_suffix_nano = DFA.from_suffix({'a', 'n', 'o', 'b'}, 'nano', contains=False)

DFA.from_substring(cls, input_symbols, substring, contains=True, must_be_suffix=False)

Directly computes the minimal DFA recognizing strings containing the given
substring.

contains_substring_nano = DFA.contains_substring({'a', 'n', 'o', 'b'}, 'nano')
avoids_substring_nano = DFA.contains_substring({'a', 'n', 'o', 'b'}, 'nano', contains=False)

DFA.from_subsequence(cls, input_symbols, subsequence, contains=True)

Directly computes the minimal DFA recognizing strings containing the given
subsequence.

contains_substring_dcba = DFA.contains_subsequence({'a', 'b', 'c', 'd'}, 'dcba')
avoids_substring_dcba = DFA.contains_subsequence({'a', 'b', 'c', 'd'}, 'dcba', contains=False)

DFA.of_length(cls, input_symbols, min_length=0, max_length=None, symbols_to_count=None)

Directly computes the minimal DFA which accepts all words whose length is between min_length
and max_length, inclusive. To allow infinitely long words, the value None can be
passed in for max_length. If symbols_to_count is None (default behavior), then counts
all symbols. Otherwise, only counts symbols present in the set symbols_to_count and
ignores other symbols.

DFA.count_mod(cls, input_symbol, k, remainders=None, symbols_to_count=None)

Directly computes a DFA that counts given symbols and accepts all strings where
the remainder of division by k is in the set of remainders given. The
default value of remainders is {0} and all symbols are counted by default.

even_length_strings = DFA.count_mod({'0', '1'}, 2)
odd_number_of_ones = DFA.count_mod({'0', '1'}, 2, remainders={1}, symbols_to_count={'1'})

DFA.nth_from_start(cls, input_symbols, symbol, n)

Directly computes the minimal DFA which accepts all words whose n-th character from the end is symbol, where n is a positive integer.

dfa = DFA.nth_from_start({'0', '1'}, '1', 4)

DFA.nth_from_end(cls, input_symbols, symbol, n)

Directly computes the minimal DFA which accepts all words whose n-th character from the end is symbol, where n is a positive integer.

dfa = DFA.nth_from_end({'0', '1'}, '1', 4)

DFA.count_words_of_length()

Counts words (of the given length) that are accepted by the DFA

dfa.count_words_of_length(3)

DFA.universal_language(cls, input_symbols)

Returns a new DFA which accepts all strings formed from the given input symbols.

DFA.universal_language(input_symbols={'a', 'b'})

DFA.empty_language(cls, input_symbols)

Returns a new DFA which rejects all strings formed from the given input symbols.

DFA.empty_language(input_symbols={'a', 'b'})

New NFA Methods

Please see the NFA documentation for the full details on these new
methods.

Click to expand new NFA methods

NFA.left_quotient(self, other)

Returns the left quotient of one NFA with respect to another.

new_nfa = nfa1.left_quotient(nfa2)

NFA.right_quotient(self, other)

Returns the right quotient of one NFA with respect to another.

new_nfa = nfa1.right_quotient(nfa2)

NFA.shuffle_product(self, other)

Returns shuffle product of two NFAs. This produces an NFA that accepts all
interleavings of strings in the input NFAs.

new_nfa = nfa1.shuffle_product(nfa2)

NFA.edit_distance(cls, input_symbols, reference_str, max_edit_distance, insertion=True, deletion=True, substitution=True)

Constructs the NFA for the given reference_str for the given Leve...

Read more

v6.0.2

27 Aug 01:04
f7acee8
Compare
Choose a tag to compare
  • Fixed a critical bug caused by v6.0.1, where v6.0.1 attempted to implement the missing NFA.to_regex() method. However, the addition of the method caused a circular import error that was forced to be remove for the sake of the library's stability. So sorry about that.

v6.0.1

27 Aug 00:39
317570f
Compare
Choose a tag to compare
  • Added missing implementation for NFA.to_regex(), which was absent from the v6.0.0 release. Sorry about that!
  • Corrections to documentation in README

v6.0.0

26 Aug 01:23
bb6d5f9
Compare
Choose a tag to compare

Automata v6 is a major new release with new features and a few breaking changes that keep the library moving forward. Big thanks to @abhinavsinha-adrino and @eliotwrobson for most of these improvements.

New Features

  • A new GNFA class to represent generalized non-deterministic finite automata (thanks @abhinavsinha-adrino!)
    • Please see the README for documentation on this new class
  • A new automata.base.regex module with utilities for working with
    automata-style regular expressions; a new NFA.to_regex method has also been added
    • Please see the README for documentation on this new module
  • A new NFA.eliminate_lambda method to return an NFA with lambda/epsilon
    transitions removed (#31)
  • Significant performance optimizations to various DFA methods (thanks
    @eliotwrobson!):
    • DFA.from_nfa
    • DFA.isempty
    • DFA.isfinite
    • DFA.__eq__

Breaking Changes

  • Added networkx as a required dependency, providing substantial performance improvements for certain DFA/NFA methods, and also streamlining the code to improve maintainability
  • Dropped Python 3.6 support, since it has been end-of-life since December 2021. Python 3.7 is currently the minimum version of Python required by this library.

Bug Fixes

  • Fixed a bug with NFA.kleene_star() where outgoing transitions for every final state were required (but shouldn't be)
  • Fixed a bug where the DFA union and intersection operations were not returning the correct type if you had subclassed from DFA