relationships as a set? #124

Open
pohutukawa opened this issue Sep 9, 2018 · 3 comments
Comments

@pohutukawa

I've modelled PROV graphs a few times and noticed that relationships are not stored as a set of unique edges.

For example, I'm expressing that one actor (a person) is a delegate of another actor (an organisation) using the actedOnBehalfOf relationship. Unfortunately, if I run certain things through a function that makes such connections automatically, I end up with several identical actedOnBehalfOf relationships between the same two nodes in the 'wired up' form.

I can see that there may be different relationships between the same source and destination vertex; however, if they are identical, I suppose they should be recorded only once, right?

Or did I misunderstand something here?

If the behaviour is intended, then it is quite cumbersome to drill down into the PROV document data structures, find out whether a particular delegation record already exists, and then avoid adding another one.
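One way to avoid digging through the document on every insert is to keep a side index of the relations already added and consult it first. The sketch below is plain Python and does not call the prov library; `Relation`, `RelationIndex` and `add_once` are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a relation record: its kind plus both endpoints.
Relation = namedtuple("Relation", ["kind", "source", "target"])


class RelationIndex:
    """Remembers which relations were already added (illustrative helper)."""

    def __init__(self):
        self._seen = set()      # fast membership test
        self.relations = []     # insertion-ordered, duplicate-free

    def add_once(self, kind, source, target):
        """Record the relation unless an identical one already exists.

        Returns True if the relation was new, False if it was a duplicate.
        """
        rel = Relation(kind, source, target)
        if rel in self._seen:
            return False
        self._seen.add(rel)
        self.relations.append(rel)
        return True


index = RelationIndex()
first = index.add_once("actedOnBehalfOf", "ex:alice", "ex:acme")
second = index.add_once("actedOnBehalfOf", "ex:alice", "ex:acme")
```

In real code, the call that adds the delegation to the PROV document would be made only when `add_once` returns True.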

@pohutukawa
Author

Example:
[image]

@trungdong
Owner

PROV allows multiple relations of the same type to exist between elements. Hence, the current behaviour is expected.

However, I think we can achieve what you want by extending prov.model.ProvBundle.unified to merge all identical records. This should be straightforward: put all the records into a set and recreate a ProvBundle from the remaining records.
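A minimal sketch of the set-based idea, assuming only that records are hashable and compare equal when identical. It uses plain tuples as stand-ins rather than real ProvRecord objects, keeps the first-seen order (a bare `set` would lose it), and leaves out the step of rebuilding a ProvBundle from the result. `unique_records` is a hypothetical helper name, not part of the prov library.

```python
def unique_records(records):
    """Return the records with exact duplicates dropped, keeping first-seen order."""
    seen = set()
    result = []
    for record in records:
        if record not in seen:
            seen.add(record)
            result.append(record)
    return result


# Stand-in records (assumption: real records would hash and compare similarly).
records = [
    ("agent", "ex:alice"),
    ("agent", "ex:acme"),
    ("actedOnBehalfOf", "ex:alice", "ex:acme"),
    ("actedOnBehalfOf", "ex:alice", "ex:acme"),  # exact duplicate
    ("wasAttributedTo", "ex:report", "ex:alice"),
]

deduped = unique_records(records)
```

Only exact duplicates are removed here; two relations of the same type that differ in any attribute remain separate, which matches what PROV allows.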

The other alternative is to convert a PROV document into RDF, which I think should merge the identical triples.
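The reasoning behind the RDF route is that an RDF graph is a set of triples, so asserting the same triple twice leaves a single copy. The sketch below models that with a plain Python set instead of an RDF library; `add_triple` and the `ex:` names are made up for illustration, and whether the library's RDF output represents each relation as one plain triple is an assumption not verified here.

```python
# A toy RDF graph: a set of (subject, predicate, object) triples.
triples = set()


def add_triple(subject, predicate, obj):
    """Assert a triple; re-asserting an existing triple changes nothing."""
    triples.add((subject, predicate, obj))


add_triple("ex:alice", "prov:actedOnBehalfOf", "ex:acme")
add_triple("ex:alice", "prov:actedOnBehalfOf", "ex:acme")  # identical, collapses
add_triple("ex:report", "prov:wasAttributedTo", "ex:alice")
```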

Note: ProvRecord implements a hash function, so records can be placed in a set.

@pohutukawa
Author

Yes, what strikes me as redundant is having multiple relations of the same type between the same two nodes without any differences. If labels or other attributes attached to one or the other were different, I could see a point in it. This matters especially if some of the serialisations (such as the above-mentioned PROV-O) won't even preserve the duplicates.
I'll try using ProvBundle.unified. Otherwise, might there be demand for an optional argument on the ProvDocument constructor to forbid or allow duplicate relationships of the same kind?
Also, knowing that ProvRecord is hashable, I'll see if I can simplify my code for detecting duplicates.
