
[WIP] Doc/add use cases #215

Open
ZenGround0 wants to merge 8 commits into master from doc/add-use-cases
Conversation

@ZenGround0 (Collaborator) commented Oct 29, 2017

This is not close to done yet, as most use case work is missing. @hsanjuan, if you get a chance to look this over I'd like to know:

  1. If I'm saying anything false, or anything that you disagree with, particularly in the first two sections.
  2. Whether the first two sections belong in this document (probably not), or in this repo at all (maybe not). I can see that these sections might seem somewhat off topic for use case descriptions. They are the result of my making sense of scattered historical information, which has made the motivation of the project much clearer and has gotten me up to speed on prior thought regarding ipfs-cluster user interfaces. In my head both UI and general motivation are strongly tied to use cases, and writing this down helped me and might help others understand the project.
  3. Any input on the informal use case structure and how it could be improved.

@ghost assigned ZenGround0 on Oct 29, 2017
@ghost added the status/in-progress label on Oct 29, 2017
@coveralls

Coverage Status

Coverage decreased (-0.09%) to 74.805% when pulling a6b55c0 on doc/add-use-cases into e51f771 on master.

@coveralls

Coverage Status

Coverage decreased (-0.02%) to 74.87% when pulling 7f5625a on doc/add-use-cases into e51f771 on master.

@hsanjuan (Collaborator) left a comment:

Hey, I think this is a good start. A use-case doc would probably be much longer and more detailed, but we also need to decide which particular use cases are worth that effort (that'd be because we aim to make them happen). Until then it's great to gather all the pieces of information in one place like this.

Early discussion, again in https://github.com/ipfs/notes/issues/58, outlines a particular approach that remains relevant to discussion today: ipfs-cluster as a virtual ipfs node (vNode for short). The idea is that ipfs-cluster nodes could implement the ipfs api, only exposing state agreed upon by all nodes through consensus. A simple example: all nodes in the cluster agree on a single peer id, and after reaching agreement all nodes respond with this id to requests on their ipfs vNode id endpoint. A more useful example: an ipfs-cluster node gets an `add <file>` request through the ipfs vNode add endpoint; the cluster nodes coordinate adding this file, perhaps doing something clever like replicating file data across multiple nodes, and reach agreement via consensus that the data was added and pinned successfully. After all this occurs the api endpoint returns the normal message indicating success.
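To make the vNode idea concrete, here is a minimal, purely hypothetical sketch of a peer answering the ipfs `id` endpoint from agreed-upon cluster state. The `ClusterState` type, its fields, the placeholder id and addresses, and the listen address are all made up for illustration; this is not existing ipfs-cluster code.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// ClusterState is a hypothetical view of state agreed upon through consensus.
// In the vNode idea, every peer would answer ipfs API requests from this
// shared state rather than from its local ipfs daemon.
type ClusterState struct {
	VNodeID   string   // the single peer id the whole cluster agreed to present
	Addresses []string // multiaddresses of all peers backing the vNode
}

func main() {
	state := ClusterState{
		VNodeID:   "QmClusterVNodeID...", // placeholder, not a real hash
		Addresses: []string{"/ip4/10.0.0.1/tcp/4001", "/ip4/10.0.0.2/tcp/4001"},
	}

	// Mimic the go-ipfs /api/v0/id endpoint, but reply with the cluster-wide
	// identity instead of a single daemon's identity.
	http.HandleFunc("/api/v0/id", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]interface{}{
			"ID":        state.VNodeID,
			"Addresses": state.Addresses,
		})
	})

	log.Fatal(http.ListenAndServe("127.0.0.1:9095", nil)) // hypothetical listen address
}
```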

Designing ipfs-cluster to work this way has some benefits, including a familiar api for user interaction, the ability to use a cluster anywhere an ipfs node is used, and the ability to make ipfs-clusters depend on other ipfs-clusters as the ipfs nodes that they coordinate. This last property has the potential to make scaling ipfs-cluster easier; if large groups of participants can be abstracted away, consensus peer group size can remain bounded as cluster participants grow arbitrarily. It is not always the case that an ipfs api is the best user interface for adding files to ipfs-cluster. If ipfs-cluster were to support behavior like per-pin replication configuration, for example different pins specifying different replication factors as it does today, then the ipfs api would have no endpoint to encode this information and some kind of cluster-specific interface would be needed (see https://botbot.me/freenode/ipfs/2017-02-09/ for a somewhat related conversation that includes discussion of more use cases).

Collaborator:

We do support 'per-pin' replication configuration information. You can assign a different replication factor to every pin.

@ZenGround0 (Collaborator Author):

@flyingzumwalt today during the all-hands you mentioned having come across potential use cases while coordinating with data-together. If you have any of these in writing (very brief descriptions are fine), please feel free to add them below in the comments so that I can include them in our aggregation and keep track. Thank you!

@GitCop commented Dec 7, 2017

There were the following issues with your Pull Request

  • Commit: 52c1691

  • Invalid signoff. Commit message must end with
    License: MIT
    Signed-off-by:

  • Your subject line is longer than 80 characters

  • Commit: a6b55c0

  • Invalid signoff. Commit message must end with
    License: MIT
    Signed-off-by:

  • Commit: 7f5625a

  • Invalid signoff. Commit message must end with
    License: MIT
    Signed-off-by:

Guidelines are available at https://github.com/ipfs/ipfs-cluster/blob/master/contribute.md


This message was auto-generated by https://gitcop.com


@ZenGround0 changed the title from "[WIP] Doc/add use cases Highly WIP" to "[WIP] Doc/add use cases" on Dec 18, 2017
@coveralls

Coverage Status

Coverage decreased (-2.08%) to 72.817% when pulling 305b2fc on doc/add-use-cases into e51f771 on master.


@coveralls

Coverage Status

Coverage decreased (-2.09%) to 72.799% when pulling 8b9f65f on doc/add-use-cases into e51f771 on master.


@ZenGround0 (Collaborator Author):

@hsanjuan A lot of this is still pretty rough. It was nice to get some rough ideas down though. If you get a chance and have any interest in checking this out, feedback is super welcome. Also feel free to add in the gateway stuff if you like.

@coveralls

Coverage Status

Coverage decreased (-2.08%) to 72.817% when pulling d2abac4 on doc/add-use-cases into e51f771 on master.

@hsanjuan (Collaborator) left a comment:

I have read through and added some thoughts. I think the main use cases are here.

The mirror (with large-tree support), cdn, and pinning rings are probably the most important ones; those are the ones we'd need to really flesh out in terms of requirements that can eventually be translated into OKRs (maybe not for this quarter, but for the future).


Description: An ipfs user with multiple machines wants to run their ipfs node with better replication or availability guarantees. The user creates an ipfs cluster across machines. Adding content to ipfs automatically triggers a pin in the cluster according to some predetermined replication strategy. The user advertises multiaddresses from all the ipfs daemons as multiaddresses of the ipfs node mirror.

Thoughts: This use case, like others, requires some mechanism for automatic pinning upon adding to ipfs. Depending on the adding mechanism this might be straightforward functionality to add to cluster; for example, cluster's ipfs proxy add endpoint will probably eventually do this by default. However in some cases, such as if each machine writes to ipfs over the fuse interface, this would be more difficult. A mirrored node could potentially advertise itself as an ipfs node and advertise the cluster ipfs proxy endpoint addresses as its multiaddresses.

Collaborator:

"will probably eventually do this by default" -> "does this".

There was some talk of go-ipfs providing a websockets endpoint which could send events. This would provide a very nice way to be informed about pins/unpins in ipfs and do automatic mirroring.
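As a usage sketch of the proxy behaviour described above: adding through the cluster's ipfs proxy looks just like adding to a plain ipfs daemon, only the address changes. This assumes the proxy's default listen address (127.0.0.1:9095); the file name and content are made up.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"mime/multipart"
	"net/http"
	"strings"
)

// addViaClusterProxy posts a file to the cluster ipfs proxy /api/v0/add
// endpoint, exactly as one would against a go-ipfs daemon. The proxy
// intercepts the request and pins the result in the cluster.
func addViaClusterProxy(proxyURL, name string, content io.Reader) (string, error) {
	var body bytes.Buffer
	mw := multipart.NewWriter(&body)
	part, err := mw.CreateFormFile("file", name)
	if err != nil {
		return "", err
	}
	if _, err := io.Copy(part, content); err != nil {
		return "", err
	}
	if err := mw.Close(); err != nil {
		return "", err
	}

	req, err := http.NewRequest("POST", proxyURL+"/api/v0/add", &body)
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", mw.FormDataContentType())

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	return string(out), err // same JSON the ipfs add endpoint returns
}

func main() {
	// 127.0.0.1:9095 is the default ipfs proxy address in the cluster config.
	out, err := addViaClusterProxy("http://127.0.0.1:9095", "hello.txt", strings.NewReader("hello cluster"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```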

Contributor:

Following up on an old thread -- do we do this now? Is this how Cluster works? -->

An IPFS user with multiple computers wants to make sure their content is always available on the IPFS network. That means they need to make sure their computer is always on and connected to the network, or that they've made copies on other computers so that if one goes down, there are still copies of their content on other hard drives on the network.

Before IPFS Cluster, this user would have needed to do [some really annoying thing]. Now, this user can use Cluster to create a connected set of locations to store their content. When the user adds content to IPFS, Cluster automatically creates copies across the user’s computers according to their settings. Perhaps this user is very worried about losing access to this content, and so sets Cluster to make 100 copies; perhaps the user is less worried about consistent access, and so sets Cluster to make only one other copy.

As computers come online and offline, Cluster also takes care of maintaining the user’s replication strategy: the user might always want 100 copies of their content available, but the exact computers storing this information might change. There will, however, always be 100 copies available. People are able to find content because Cluster is [able to do something nifty around mirroring and multiaddresses that I need to understand still].

Is this right? (I'm hoping to slowly get a handle on how to describe what Cluster does so we can make a more plain-language version of our intro/high-level docs.)


These are some WIP ipfs-cluster use case sketches. They are not formal use cases; more accurately, they are groups of related use cases that could be further decomposed into the more narrowly scoped operations found in formal use cases.

## ipfs node mirror

Collaborator:

We could add a field that specifies a list of things needed to support this use case, and which of those are already implemented. It does not need to be exhaustive, but it should give an overview.

Collaborator Author:

Great idea; this is something I can work on.

3. As a last example, imagine a cluster serving as the storage backend for potentially large queries over blockchains. For example, say you want to look for all transactions that spend outputs of transactions in block X. You could potentially use ipld selectors to query and pin all of the relevant hashes in an ipfs node, but what if there is too much data to reasonably fit on any one ipfs node's machine? A group of trusted nodes could be brought together as an ipfs-cluster to handle such queries and avoid running out of space for storing the results.

Thoughts:
1. To support storing data for miners we would probably need to examine ipfs-cluster's latency profile more seriously. The above description does not specify how to prevent the set of pinned cids from ballooning quickly. This use case would require some kind of sharding of the transaction set in a way that does not track every transaction's pin in the cluster shared state. Because new transactions would be added all the time, this is not a simple application of basic sharding, which does one import of a huge file into shards. ipfs-cluster could address this with a strategy for incrementally updating shard membership. The state blow-up could also be mitigated if the cluster recursively pinned larger subdags (to a certain depth), i.e. the hash of every X blocks. The security of go-ipfs would need to be vetted more thoroughly so that users could trust that hash lookups securely resolve to the correct data. In general this use case idea needs more domain-specific knowledge. Is blockchain storage currently a pain point for mining operations that ipfs and cluster could address? How do popular mining clients (bitcoin-core, geth) handle blockchain storage? Would ipfs integration using ipfs-cluster for storing large merkle dags be possible/useful/welcome for these clients? What are their requirements (latency, security models, integration with existing tools)?

Collaborator:

I don't think ipfs or cluster are ever going to beat local blockchain storage and in-mem caches, but I do wonder what's going to happen when a chain is "too big to store".

It's interesting that ipfs/cluster can nevertheless be used to distribute blockchain data to the rest of the network (new blocks, chain downloads, etc.). ipfs+ipld already supports ingesting bitcoin/eth blocks into ipfs. It would be awesome if the chain sync operation would just fetch stuff from ipfs into the chain database for a start.


2. For the second example, you could imagine that the block explorer is a dApp whose users run ipfs and cache blockchain data as they look it up, while the ipfs-cluster acts as a permanent store for slower lookups (a similar pattern to the use case below). If this becomes a serious use case we should investigate the pain points that currently exist for block explorer websites to get a better picture of how cluster would fit in.

Collaborator:

Yes!



3. This is a very undeveloped idea (ipld selectors don't exist yet, and I haven't seen anyone ask for this) based on my impression that quick, expressive merkle dag searches would be valuable. I should investigate work that exists along these lines (e.g., how do current block explorers do queries over merkle trees?), and how ipld selectors compare as a next step.

Collaborator:

yeah, to do something like <ethhead>/block/*/transactions/*/from/<addr>/value

- support for dynamic cluster membership (exists today, but potentially with some bugs under lots of churn)
- some kind of trust modeling support, potentially including associating permissions to operations and assigning permissions to peers. Could make use of the proposed [capabilities service](https://github.com/ipfs/notes/issues/274).
- support for byzantine consensus protocols sounds relevant
- support for updating many uncoordinated nodes

Collaborator:

The main problem here is the trust model. I can think of several ways of approaching this use case, but I always find problems with how to prevent random users from altering the whole cluster. I think with Raft this is limited to whoever controls the cluster leader node, but Raft doesn't scale for this.

  • Can think of adding authorization to RPC calls
  • With the above, can think of only allowing trusted nodes to become leaders (Raft supports this)
  • Can think of implementing this as an application wrapping cluster/ipfs, in which the user does not run the cluster peer but only the ipfs-daemon, and the administrator runs an associated cluster peer but under his control. This would scale with composite clusters.
  • Can think of replacing the consensus layer with pubsub for this use-case, and only obeying updates signed with certain keys (perhaps for a pinning ring, the fact that the state is fully consistent among all peers rather quickly is not super important) (the more I think about this the nicer it sounds). A rough sketch of this idea follows at the end of this comment.

In any case, it is always the trust model that worries me.
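A rough, purely illustrative sketch of the pubsub idea using go-libp2p-pubsub: peers subscribe to a pin topic and only apply updates whose author is in a fixed set of trusted peer ids, relying on libp2p pubsub message signing for authenticity. The topic name, the trusted peer ids, and the update format are made up; this is not a design proposal.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/libp2p/go-libp2p"
	pubsub "github.com/libp2p/go-libp2p-pubsub"
	"github.com/libp2p/go-libp2p/core/peer"
)

func main() {
	ctx := context.Background()

	h, err := libp2p.New()
	if err != nil {
		log.Fatal(err)
	}

	ps, err := pubsub.NewGossipSub(ctx, h)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical topic carrying pin/unpin updates for the ring.
	topic, err := ps.Join("pinning-ring/pins")
	if err != nil {
		log.Fatal(err)
	}
	sub, err := topic.Subscribe()
	if err != nil {
		log.Fatal(err)
	}

	// Only updates authored by these peers are obeyed (made-up ids).
	trusted := map[peer.ID]bool{
		"12D3KooWExampleTrustedPeerA": true,
		"12D3KooWExampleTrustedPeerB": true,
	}

	for {
		msg, err := sub.Next(ctx)
		if err != nil {
			log.Fatal(err)
		}
		if !trusted[msg.GetFrom()] {
			continue // ignore updates from untrusted members of the ring
		}
		// In a real system msg.Data would be a structured pin/unpin update;
		// here we just print it.
		fmt.Printf("applying pin update from %s: %s\n", msg.GetFrom(), msg.Data)
	}
}
```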

Collaborator:

If I'm a bit bored one day I'm going to write an RFC about replacing Raft with Pubsub

Collaborator Author:

I will be super interested to read that RFC


Thoughts:
This is a particularly interesting use case as it requires significantly more new and unexplored functionality from cluster than any of the others here. As @hsanjuan mentioned in his original write-up, "The key here is to understand what the trust model is in a pinning ring, how members gain and lose trust, and who can take what actions". On a similar note, a byzantine consensus protocol may greatly help keep the ring working smoothly even when some peers misbehave. This use case also presents challenges regarding how to get nodes to update when they are managed by different individuals. The current approach to updating requires all nodes to be shut down at the same time, which may be unrealistic here.

Collaborator:

Oh lol, I now see I had already written down stuff about the trust model.

Description: An admin wishes to set up an ipfs node storing a mirror of ubuntu deb packages, to support an apt transport that downloads deb packages from ipfs. The admin wishes to download the packages and directory structure from one of the existing mirrors over http and store them in ipfs. The admin does not have a server with the 2TB of storage necessary to host the entire mirror and so cannot fit all packages on a single ipfs node. However, the admin does have access to a set of smaller servers (say 4 servers of 500GB) that together fulfill the total storage capacity of the mirror. The admin installs ipfs-cluster on each server and then commands the cluster to download the mirror data. During download the cluster allocates different pieces of the mirror directory to different machines, spreading load evenly. If there is extra space on the servers then replication of packages is a bonus, but this is not a primary concern for this use case. The cluster hosting the mirror can be assumed stable, with a fixed number of servers all run by a single admin or administrative body. After packages are added to the cluster, users will fetch them by path name from the root hash (QmAAA.../mirrorDir1/mirrorDir2/package.deb).
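To make the last step concrete, a user's machine could resolve a package by path through its local ipfs gateway (127.0.0.1:8080 is the go-ipfs default). A minimal sketch; the root hash and path below are placeholders mirroring the example path above, not a real mirror:

```go
package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

func main() {
	// Placeholder root hash and package path; a real apt-over-ipfs transport
	// would be configured with the mirror's actual root CID.
	url := "http://127.0.0.1:8080/ipfs/QmAAAExampleMirrorRoot/mirrorDir1/mirrorDir2/package.deb"

	resp, err := http.Get(url)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		log.Fatalf("gateway returned %s", resp.Status)
	}

	// Save the package locally, as an apt transport would before installing it.
	out, err := os.Create("package.deb")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	if _, err := io.Copy(out, resp.Body); err != nil {
		log.Fatal(err)
	}
}
```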

Implied ipfs-cluster requirements
- ipfs-cluster can handle importing, sharding, and distributing across the cluster a file too big for one node (see PR #268).

Collaborator:

To be exact, it's a tree too big for one node. Files are rather small in the apt repository. The problem of distributing a huge file among several peers and the problem of distributing a huge archive of small files among several peers can be approached slightly differently, even though we will probably fix them the same way (because they're large trees in the end).

Collaborator Author:

Got it, thanks for the clarification.

- ipfs-cluster provides configurable load balancing across ipfs-cluster nodes (the WAN cluster and LAN subclusters have different balancers) that can handle frequent changes to peersets
- ipfs-clusters are easy to join and leave without having to think too much about set-up or generating errors
- ipfs-cluster allows for retrieval of data across sub-clusters that are not necessarily connected, except as two subtrees of the same larger cluster
- ipfs-cluster allows individual peers to specify resource constraints

Collaborator:

This is closely related to the files api and unixfs/fuse, right? It's like mounting the ipfs fuse filesystem in /home and using cluster to make sure that its contents are propagated to all other machines/users.

@ZenGround0 (Collaborator Author) commented Jan 4, 2018:

I agree; when I pass through and list the features that cluster requires, I'll add things along these lines.


Designing ipfs-cluster to work this way has some benefits, including a familiar api for user interaction, the ability to use a cluster anywhere an ipfs node is used, and the ability to make ipfs-clusters depend on other ipfs-clusters as the ipfs nodes that they coordinate. This last property has the potential to make scaling ipfs-cluster easier; if large groups of participants can be abstracted away, consensus peer group size can remain bounded as cluster participants grow arbitrarily. It is not always the case that an ipfs api is the best user interface for adding files to ipfs-cluster. Ipfs-cluster's support for per-pin replication configuration, for example the current feature that pins can specify different replication factors, has no direct analogue among the characteristics of an ipfs node exposed over the ipfs api. As the ipfs api has no endpoint to encode this information, some kind of cluster-specific interface is often useful, for example the current cluster `pin` command that allows setting replication factors.
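As an illustration of such a cluster-specific interface, a pin request against the cluster REST API can carry replication factors that the plain ipfs api cannot express. This sketch assumes the default REST API address (127.0.0.1:9094) and the `replication-min`/`replication-max` query parameter names; the CID is a placeholder and the parameter names should be checked against the targeted ipfs-cluster version.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
)

func main() {
	cid := "QmExamplePlaceholderCid" // placeholder, not a real hash

	// Default cluster REST API address; the query parameter names are assumed
	// from the current ipfs-cluster REST API and may differ between versions.
	q := url.Values{}
	q.Set("replication-min", "2")
	q.Set("replication-max", "3")
	pinURL := fmt.Sprintf("http://127.0.0.1:9094/pins/%s?%s", cid, q.Encode())

	req, err := http.NewRequest("POST", pinURL, nil)
	if err != nil {
		log.Fatal(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("%s: %s\n", resp.Status, body)
}
```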

Though an ipfs vNode api IS partially implemented in ipfs-cluster, to the point that ipfs-clusters can be composed, only the subset of the vNode api that ipfs-cluster needs to call in order to function has received attention. There has been some discussion, in "Other open questions" of https://github.com/hsanjuan/ipfsclusterspec/blob/master/README.md, about the difficulties involved in implementing a full ipfs vNode interface, and about framing the vNode interface as a separate concern from the ipfs-cluster project's primary goal of coordinating ipfs nodes. Today, emphasis on implementing the vNode interface exists only to the extent that it enables composition of ipfs-clusters; further work may be revisited later.

Collaborator:

Perhaps reference the composite cluster use cases PR. I kind of wanted to remove https://github.com/hsanjuan/ipfsclusterspec/blob/master/README.md because it's old and lacks context (I was asked to write it when I had little experience with ipfs/libp2p).

@hsanjuan (Collaborator):

@meiqimichelle I would like to close this. Can you check if we need to absorb any information somewhere else?

@meiqimichelle (Contributor):

I will check!
