Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sql table prefix configurable #172

Open
dbu opened this issue Jan 20, 2014 · 5 comments
Open

Make sql table prefix configurable #172

dbu opened this issue Jan 20, 2014 · 5 comments
Labels

Comments

@dbu
Copy link
Member

dbu commented Jan 20, 2014

i would say the actual table names do not need to be configurable, just the prefix should really be enough. we could convert the table "local" names to constants in RepositorySchema however. or simply have getter for all those names, then one could even use a custom schema with custom table names.

the $options to the RepositorySchema should allow a table prefix. the QOMWalker should be injected the schema rather than instantiating it itself so it can get a schema with the correct prefix. the schema should also have a getter for the prefix so the walker can figure out the right table name.

the client would need a setter to be injected a custom created RepositorySchema to use it when building the QOMWalker, and of course for the table names of its own queries.

the init command also builds the schema - that one should have arguments for the options to the schema, rather than pass an empty array. in the phpcr-bundle, those options should go away again and come from the symfony configuration instead.

for the tests, the Jackalope\TestCase class should read the prefix from the globals, as we do for other configuration options, so you can define them in the phpunit.xml file.

@awildeep
Copy link

While this change is being made, making the table names themselves flexible could easily be tackled (as long as a good plan is laid out), that way a user could customize a specific table name a specific resource. For instance a format might be something like [TablePrefix][TableName], which covers already existing use cases such as phpcr_workspaces, or in my case wanting phpcr.workspaces (the dot notation for schema usage). A simple array structure passed into the RepositorySchema could look like:

$options = array (
'table_prefix' = 'phpcr_',
'table_names' = array(
    'binarydata'            = 'binarydata',
    'internal_index_types'  = 'internal_index_types',
    'namespaces'            = 'namespaces',
    'nodes'                 = 'nodes',
    'nodes_references'      = 'nodes_references',
    'nodes_weakreferences'  = 'nodes_weakreferences',
    'type_childs'           = 'type_childs',
    'type_nodes'            = 'type_nodes',
    'type_props'            = 'type_props',
    'workspaces'            = 'workspaces'
);

After thinking about this a bit further maybe a class that does this mapping, that is injected into any object that needs to access tables in in order. That way should the need arise in the future even columns could be customized for fringe use-cases.

class RepositorySchemaMap {
    private $prefix;
    private $table_name_map;
    public function __construct(string $prefix = 'phpcr_', array $table_name_map = array(    
        'binarydata'            = 'binarydata',
        'internal_index_types'  = 'internal_index_types',
        'namespaces'            = 'namespaces',
        'nodes'                 = 'nodes',
        'nodes_references'      = 'nodes_references',
        'nodes_weakreferences'  = 'nodes_weakreferences',
        'type_childs'           = 'type_childs',
        'type_nodes'            = 'type_nodes',
        'type_props'            = 'type_props',
        'workspaces'            = 'workspaces'
    ))
    {
         $this->prefix = $prefix;
         $this->table_name_map = $table_name_map;
         //Check for default values 
    }
    public function __get($name) {
        return $this->prefix . $this->table_name_map[$name];
    }
}

This is not exactly how I would like it to be, but just a quick sketch of an idea. This could then be injected into RepositorySchema, QOMWalker, etc, and we gain configurability. Magic getter is not required, but it makes the example shorter, a more explicit version would be my preference.

Edit, one more quick point, there are sequences defined in Client as well. These should also be configurable (even though they are PostgreSQL or PG equivalent DB specific) within the scope of any such mapper. Sequences can be mapped into schemas just as easily as a table.

@awildeep
Copy link

The more I look into this, the more I wonder if we aren't just reinventing Doctrine ORM. This is likely a faster implementation, but I am wondering if it is worth it.

@dbu
Copy link
Member Author

dbu commented Jan 21, 2014

i agree with your proposal for the map class, more elegant than passing the RepositorySchema around left and right.

regarding ORM: PHPCR is not at all an ORM, but (ab)using relational databases to store nosql data. in PHPCR you don't have your document classes but only Node objects. the PHPCR implementation needs to store the Node objects - and this implementation transforms that into database tables. did you read the PHPCR tutorial here? https://github.com/phpcr/phpcr-docs/blob/master/tutorial/Tutorial.md

on top of PHPCR we build Doctrine PHPCR-ODM which is member of the Doctrine family. the advantage of it over the ORM for content management is that trees are a native concept. all content is organized in a tree so you can use paths to identify any content, or for example link between arbitrary objects if you want to. you don't have this kind of flexibility in the ORM. we also leverage the versioning and other advanced features of PHPCR and have built a translation into the core of it.

so, i hope that did not sound too defensive. just trying to explain why we think there is use for PHPCR and PHPCR-ODM besides Doctrine ORM :-)

@awildeep
Copy link

@dbu no offense taken. All I meant to say was, it just "feels" like we are building an abstraction layer that does what an ORM already handles. For instance there is already code in place that handles sequences as a bit of an exception for PostgreSQL. I am not saying we should use DoctrineORM (it's probably too heavy anyway), but we may glean some insight from ORM projects when implementing this change.

Ultimately any DB specifics should be unknown to the RepositorySchema, QOMWalker, etc, and tied back to some form of map object that could in turn then be overwritten, mocked, or manipulated.

In my opinion, the mapper class(es) should know about:

  • Table prefixes
  • Table name mapping by some form of key, or API
  • Any specific table space requirements that are DB dependent and required during creation (InnoDB, UTF-8 encoding, etc)
  • DB specific requirements for things like sequence names in the case of PostgreSQL, so these could be customizable. Sequences can reside in schemas for instance.

It might be nice if there was a way to map the column names as well, but that could be another round of refactoring.

@dbu
Copy link
Member Author

dbu commented Jan 21, 2014

absolutely agree that the Client class could use quite a bit of cleanup.
it just grew to that size. looking how doctrine solved a problem makes
sense yes - maybe doctrine dbal could even do more than we use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants