Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new iterator for TREE index: NP (NextPrefix) #9994

Closed
unera opened this issue May 6, 2024 · 3 comments · Fixed by #10028
Closed

new iterator for TREE index: NP (NextPrefix) #9994

unera opened this issue May 6, 2024 · 3 comments · Fixed by #10028
Assignees
Labels
feature A new functionality

Comments

@unera
Copy link
Collaborator

unera commented May 6, 2024

example table contains PRIMARY KEY

- foo
- foo bar
- foo bar baz
- g
- g abc
- g cde
- g cde def
for _, tuple in space:pairs('foo', { iterator = 'GE' }) do
   print(tuple)
end

- foo
- foo bar
...
for _, tuple in space:pairs('foo', { iterator = 'NP' }) do
  print(tuple)
end

- g
- g abc

the iterator works like GT but for the first key it can receive a string that can be part of the field (prefix substring).
the first iteration must return value which prefix is more than noticed in key.

@unera unera added the feature A new functionality label May 6, 2024
@unera
Copy link
Collaborator Author

unera commented May 6, 2024

We need the iterator for S3 that operates by triplets

- bucket_id
- key
- version
for _, file in space:pairs({bucket, name}, {iterator = 'NP'}) do
   print('next name is ', file.name)
   break
end

@alyapunov
Copy link
Contributor

alyapunov commented May 6, 2024

I would define it more formally.

Let's define tuple_key as an array of values of a tuple that correspond to index parts. For example if a space has format
{{'a', 'string'}, {'b', 'string'}, {'c', 'map'}} and its index has parts {{'b'}, {'c.e', 'string'}} then tuple {'u', 'v', {d = 'x', e = 'y', f = 'z'}} will have tuple_key as {'v', 'y'}.
Let search_key be the array of values that passed to pairs or select as the first argument.

For example usual definition that a tuple_key {t1, t2,... tn} is greater than a search_key {s1, s2,... sm} is that there is such k: 1 <= k <= m that ti == si for i = 1, k - 1 AND tk > sk (keep in mind that in case of descending part '>' must be replaced with '<').

It seems that 'np' aka 'NextPrefix' iterator is only meaningful if the last part of search key and the corresponding tuple key part are strings. If it is not so, let's define behavior of 'np' iterator to be the same as 'gt' iterator. The following definition is for strings.

Let's define a property that a value is greater than a search_value in terms or prefix:

return string.sub(value, 1, #search_value) > search_value
-- keep in mind that in case of descending part '>' must be replaced with '<'.

Let's define a property that a tuple_key {t1, t2,... tn} is greater than a search_key {s1, s2,... sm} in terms or prefix if the tuple_key is greater that search_subkey {s1, s2,... sm-1} OR tuple_key is equal to search_subkey {s1, s2,... sm-1} while the last part tm is greater than sm in terms of prefix.

A tuple is greater than a search_key in terms of prefix if its tuple_key in that index s greater than the search_key in terms of prefix.

So the 'np' iterator must start from the first tuple in the index that is greater than the search_key in terms of prefix, and the continue normally as any forward iterator ('gt' for example) would continue.

@unera
Copy link
Collaborator Author

unera commented May 6, 2024

@alyapunov Yes, You are right.

The iterator selects the first tuple by rule:

string.sub(value, 1, #search_value) > search_value

all the following tuples select as in 'GT/GE' iterators.

alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2024
Implement 'np' (next prefix) and 'pp' (previous prefix) iterators.
They work only in memtx tree and in a nutshell searches for
strings with greater ('np') or less ('pp') prefix of size as in
given key, comparing with given key.

Closes tarantool#9994

@TarantoolBot document
Title:'np' and 'pp' (next/previous prefix) iterators

Now there are two more iterators avaialbe: 'np' (next prefix)
and 'pp' (previous prefix). They work only in memtx tree. Also,
if the last part of key is not a string, they degrade to 'gt'
and 'lt' iterators.

These iterators introduce special comparison of the last part of
key (if it is a string). In terms of lua, if s is the search part,
and t is the corresponding tuple part, 'np' iterator searches for
the first tuple with string.sub(t, 1, #s) > s, while 'pp' searches
for the last tuple with string.sub(t, 1, #s) < s.

Comparison of all other parts of the key remains normal.

As usual, these iterators are available both in select and pairs,
in index and space methods.

Similar to all other tree iterators, they change only initial
search of selection. Once the first tuple found, the rest are
selected sequentially in direct (for 'np') or reverse (for 'pp')
order of the index.

For example:
```
tarantool> s:select{}
---
- - ['a']
  - ['aa']
  - ['ab']
  - ['b']
  - ['ba']
  - ['bb']
  - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'np'})
---
- - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'pp'})
---
- - ['ab']
  - ['aa']
  - ['a']
...
```
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2024
Implement 'np' (next prefix) and 'pp' (previous prefix) iterators.
They work only in memtx tree and in a nutshell searches for
strings with greater ('np') or less ('pp') prefix of size as in
given key, comparing with given key.

Closes tarantool#9994

@TarantoolBot document
Title: 'np' and 'pp' (next/previous prefix) iterators

Now there are two more iterators avaialbe: 'np' (next prefix)
and 'pp' (previous prefix). They work only in memtx tree. Also,
if the last part of key is not a string, they degrade to 'gt'
and 'lt' iterators.

These iterators introduce special comparison of the last part of
key (if it is a string). In terms of lua, if s is the search part,
and t is the corresponding tuple part, 'np' iterator searches for
the first tuple with string.sub(t, 1, #s) > s, while 'pp' searches
for the last tuple with string.sub(t, 1, #s) < s.

Comparison of all other parts of the key remains normal.

As usual, these iterators are available both in select and pairs,
in index and space methods.

Similar to all other tree iterators, they change only initial
search of selection. Once the first tuple found, the rest are
selected sequentially in direct (for 'np') or reverse (for 'pp')
order of the index.

For example:
```
tarantool> s:select{}
---
- - ['a']
  - ['aa']
  - ['ab']
  - ['b']
  - ['ba']
  - ['bb']
  - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'np'})
---
- - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'pp'})
---
- - ['ab']
  - ['aa']
  - ['a']
...
```
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2024
Implement 'np' (next prefix) and 'pp' (previous prefix) iterators.
They work only in memtx tree and in a nutshell searches for
strings with greater ('np') or less ('pp') prefix of size as in
given key, comparing with given key.

Closes tarantool#9994

@TarantoolBot document
Title: 'np' and 'pp' (next/previous prefix) iterators

Now there are two more iterators available: 'np' (next prefix)
and 'pp' (previous prefix). They work only in memtx tree. Also,
if the last part of key is not a string, they degrade to 'gt'
and 'lt' iterators.

These iterators introduce special comparison of the last part of
key (if it is a string). In terms of lua, if s is the search part,
and t is the corresponding tuple part, 'np' iterator searches for
the first tuple with string.sub(t, 1, #s) > s, while 'pp' searches
for the last tuple with string.sub(t, 1, #s) < s.

Comparison of all other parts of the key remains normal.

As usual, these iterators are available both in select and pairs,
in index and space methods.

Similar to all other tree iterators, they change only initial
search of selection. Once the first tuple found, the rest are
selected sequentially in direct (for 'np') or reverse (for 'pp')
order of the index.

For example:
```
tarantool> s:select{}
---
- - ['a']
  - ['aa']
  - ['ab']
  - ['b']
  - ['ba']
  - ['bb']
  - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'np'})
---
- - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'pp'})
---
- - ['ab']
  - ['aa']
  - ['a']
...
```
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 21, 2024
Implement 'np' (next prefix) and 'pp' (previous prefix) iterators.
They work only in memtx tree and in a nutshell searches for
strings with greater ('np') or less ('pp') prefix of size as in
given key, comparing with given key.

Closes tarantool#9994

@TarantoolBot document
Title: 'np' and 'pp' (next/previous prefix) iterators

Now there are two more iterators available: 'np' (next prefix)
and 'pp' (previous prefix). They work only in memtx tree. Also,
if the last part of key is not a string, they degrade to 'gt'
and 'lt' iterators.

These iterators introduce special comparison of the last part of
key (if it is a string). In terms of lua, if s is the search part,
and t is the corresponding tuple part, 'np' iterator searches for
the first tuple with string.sub(t, 1, #s) > s, while 'pp' searches
for the last tuple with string.sub(t, 1, #s) < s.

Comparison of all other parts of the key remains normal.

As usual, these iterators are available both in select and pairs,
in index and space methods.

Similar to all other tree iterators, they change only initial
search of selection. Once the first tuple found, the rest are
selected sequentially in direct (for 'np') or reverse (for 'pp')
order of the index.

For example:
```
tarantool> s:select{}
---
- - ['a']
  - ['aa']
  - ['ab']
  - ['b']
  - ['ba']
  - ['bb']
  - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'np'})
---
- - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'pp'})
---
- - ['ab']
  - ['aa']
  - ['a']
...
```
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 22, 2024
Implement 'np' (next prefix) and 'pp' (previous prefix) iterators.
They work only in memtx tree and in a nutshell searches for
strings with greater ('np') or less ('pp') prefix of size as in
given key, comparing with given key.

Closes tarantool#9994

@TarantoolBot document
Title: 'np' and 'pp' (next/previous prefix) iterators

Now there are two more iterators available: 'np' (next prefix)
and 'pp' (previous prefix). They work only in memtx tree. Also,
if the last part of key is not a string, they degrade to 'gt'
and 'lt' iterators.

These iterators introduce special comparison of the last part of
key (if it is a string). In terms of lua, if s is the search part,
and t is the corresponding tuple part, 'np' iterator searches for
the first tuple with string.sub(t, 1, #s) > s, while 'pp' searches
for the last tuple with string.sub(t, 1, #s) < s.

Comparison of all other parts of the key remains normal.

As usual, these iterators are available both in select and pairs,
in index and space methods.

Similar to all other tree iterators, they change only initial
search of selection. Once the first tuple found, the rest are
selected sequentially in direct (for 'np') or reverse (for 'pp')
order of the index.

For example:
```
tarantool> s:select{}
---
- - ['a']
  - ['aa']
  - ['ab']
  - ['b']
  - ['ba']
  - ['bb']
  - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'np'})
---
- - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'pp'})
---
- - ['ab']
  - ['aa']
  - ['a']
...
```
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 22, 2024
Implement 'np' (next prefix) and 'pp' (previous prefix) iterators.
They work only in memtx tree and in a nutshell searches for
strings with greater ('np') or less ('pp') prefix of size as in
given key, comparing with given key.

Closes tarantool#9994

@TarantoolBot document
Title: 'np' and 'pp' (next/previous prefix) iterators

Now there are two more iterators available: 'np' (next prefix)
and 'pp' (previous prefix). They work only in memtx tree. Also,
if the last part of key is not a string, they degrade to 'gt'
and 'lt' iterators.

These iterators introduce special comparison of the last part of
key (if it is a string). In terms of lua, if s is the search part,
and t is the corresponding tuple part, 'np' iterator searches for
the first tuple with string.sub(t, 1, #s) > s, while 'pp' searches
for the last tuple with string.sub(t, 1, #s) < s.

Comparison of all other parts of the key remains normal.

As usual, these iterators are available both in select and pairs,
in index and space methods.

Similar to all other tree iterators, they change only initial
search of selection. Once the first tuple found, the rest are
selected sequentially in direct (for 'np') or reverse (for 'pp')
order of the index.

For example:
```
tarantool> s:select{}
---
- - ['a']
  - ['aa']
  - ['ab']
  - ['b']
  - ['ba']
  - ['bb']
  - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'np'})
---
- - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'pp'})
---
- - ['ab']
  - ['aa']
  - ['a']
...
```
alyapunov added a commit that referenced this issue May 22, 2024
Implement 'np' (next prefix) and 'pp' (previous prefix) iterators.
They work only in memtx tree and in a nutshell searches for
strings with greater ('np') or less ('pp') prefix of size as in
given key, comparing with given key.

Closes #9994

@TarantoolBot document
Title: 'np' and 'pp' (next/previous prefix) iterators

Now there are two more iterators available: 'np' (next prefix)
and 'pp' (previous prefix). They work only in memtx tree. Also,
if the last part of key is not a string, they degrade to 'gt'
and 'lt' iterators.

These iterators introduce special comparison of the last part of
key (if it is a string). In terms of lua, if s is the search part,
and t is the corresponding tuple part, 'np' iterator searches for
the first tuple with string.sub(t, 1, #s) > s, while 'pp' searches
for the last tuple with string.sub(t, 1, #s) < s.

Comparison of all other parts of the key remains normal.

As usual, these iterators are available both in select and pairs,
in index and space methods.

Similar to all other tree iterators, they change only initial
search of selection. Once the first tuple found, the rest are
selected sequentially in direct (for 'np') or reverse (for 'pp')
order of the index.

For example:
```
tarantool> s:select{}
---
- - ['a']
  - ['aa']
  - ['ab']
  - ['b']
  - ['ba']
  - ['bb']
  - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'np'})
---
- - ['c']
  - ['ca']
  - ['cb']
...

tarantool> s:select({'b'}, {iterator = 'pp'})
---
- - ['ab']
  - ['aa']
  - ['a']
...
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants