Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support remote URLs for loading scripts #55

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Support remote URLs for loading scripts #55

wants to merge 7 commits into from

Conversation

mfitzp
Copy link
Member

@mfitzp mfitzp commented Aug 27, 2015

This commit supports loading scripts from remote URLs
and zip or gzip files. Archives are automatically de-compressed
and all contained files imported.

Folders are now walked recursively (as archives commonly have a
first level folder) and all found scripts imported.

Will need to handle the case of pre-existing scripts (by name?)
to be able to support automatic updating without creating
multiple script objects.

@mfitzp
Copy link
Member Author

mfitzp commented Aug 27, 2015

Not sure what the best way to handle updating scripts (matching by name?). Would be nice to have it automatically creating new versions from the command-line too as it would make it possible for people to auto-update scripts using cron.

@Chris7
Copy link
Member

Chris7 commented Aug 28, 2015

The issue is that the storage backend will change the name of the file automatically on saving it. We'll need to add in a method on the storage class to detect this and handle it appropriately. I'm not sure at the moment what is the most robust way to catch updates as well other than something like (script_group, filename)

@mfitzp
Copy link
Member Author

mfitzp commented Aug 29, 2015

I guess this only really applies to the command line interface though? When updating through the admin it's up to the person doing it to update the correct script. We just need a way to determine that automatically added scripts are newer versions of older scripts. I wonder if we can do something clever with the source (URL/folder) or the script filename (before import)?

@mfitzp
Copy link
Member Author

mfitzp commented Aug 31, 2015

Unfortunately we won't have a script group when loading from the command line. I guess we could provide an option to supply one (probably makes sense) and then in the absence of once check the filename against all script groups. Since the slug is a unique feature perhaps creating the script slug from the filename would give us a solution? Not sure if we would want to copy this approach to the admin UI also?

That way at least repeatedly importing from the command line would correctly create multiple versions rather than multiple scripts.

@Chris7
Copy link
Member

Chris7 commented Sep 8, 2015

You can already supply a script group on the command line via --group. However, for bulk loading it's useless since you can't divvy up scripts.

How about a user prompt and a --no-input option like collectstatic? If there is a conflict via name -- ask the user to confirm if it's a new or updated script. If they already know the answer, --no-input will let them skip it all.

@mfitzp
Copy link
Member Author

mfitzp commented Sep 8, 2015

Sounds good, I'll have a look at how to implement this based off collectstatic.

This commit supports loading scripts from remote URLs
and zip or gzip files. Archives are automatically de-compressed
and all contained files imported.

Folders are now walked recursively (as archives commonly have a
first level folder) and all found scripts imported.

Will need to handle the case of pre-existing scripts (by name?)
to be able to support automatic updating without creating
multiple script objects.
@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

Rebased on current master.

This catches the source script basename and passes it into
add_wooey_script allowing for version updating from the
commandline. Using this re-adding the same scripts over and over
creates new versions (we will want to suppress this in future
by using a hash of some kind).
@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

Ah. Adding via the commandline leaves multiple versions with default=True which then causes an issue when using get to create the view. Should be easy enough to fix, would be worth adding a unique constraint on that field?

@Chris7
Copy link
Member

Chris7 commented Sep 9, 2015

I don't know how unique would help us since it's just a boolean and not taking things like script, script_version as a combined key.

My current approach is to just set the old one to non-default and the current to a new default. I think we could automate it with a pre_save hook and check if default_version is changing.

@Chris7
Copy link
Member

Chris7 commented Sep 9, 2015

You could just do Script.latest_version.default_version = False as well, since that will automatically get the current default_version.

@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

Ah forgot about latest_version. The commit just uses the current_versions.update(default_version=False) since current_versions is lying around anyway (the query is constructed up above).

@@ -217,7 +222,7 @@ def add_wooey_script(script_version=None, script_path=None, group=None):
if isinstance(group, ScriptGroup):
group = group.group_name
if group is None:
group = 'Wooey Scripts'
group = 'Scripts'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this point to our config for the default group?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it should be WOOEY_DEFAULT_SCRIPT_GROUP, will fix.

@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

The last bit to implement I guess is the confirmation check for new versions.

I've also got some suggested changes for the name parsing (from the filename) to make things a bit more attractive. There is a standard Python package titlecase that can handle this. See below —

screen shot 2015-09-09 at 15 29 05

screen shot 2015-09-09 at 15 30 15

...obviously it isn't correct for the fasta in the example (because the source file is all lowercase). But if you have a file named with something_something_FASTA it won't lowercase it (like Python stdlib .title()).

@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

Sorry, I meant doing a combined key with the script and unique_version fields. It would stop the database getting into the state where the frontend breaks (due to duplicate unique ScriptVersions).

@Chris7
Copy link
Member

Chris7 commented Sep 9, 2015

👍 on the titlecase. What modelfield supports a combined key with unique=True. Are you talking about just adding an extra slug like field? I think it'd be simpler just to do it in the clean method of the model and raise an error from there.

@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

I was thinking of the Meta setting for unique together: https://docs.djangoproject.com/en/dev/ref/models/options/#unique-together

I'll add the titlecase into this PR. It adds a dependency on the titlecase package (it's on PyPi) but I don't think that's a biggie.

This automatically applies title case to scripts loaded from the
command line. The resulting name is used for matching against
existing entries in the database, so having a uniform method
to generate this is important.

Uses the titlecase library which applies standard transformations
while leaving pre-capitalized words intact.
@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

There is a weird error on the build, not sure what that is about (Django 1.6 only).

@Chris7
Copy link
Member

Chris7 commented Sep 9, 2015

It broke from this:

https://travis-ci.org/wooey/Wooey/jobs/79476685

Something with using the wooey settings. Maybe fix it by adding the relevant setting to the test settings?

@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

Yeah looking at it I think it's the translation string on the default script group _('Scripts'). Will see if can force the string as before.

@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

Removing lazy evaluation works but thinking about it it will break the dynamic translations on the front page so I used the non-lazy approach for the WOOEY_DEFAULT_SCRIPT_GROUP since that should be evaluated only once based on the default locale.

The tagline is still lazy evaluated (so it's dynamic for visitors browser settings). I removed the translation off "Wooey!" since it's a brand-name (of sorts) so we won't want it to change. If people specify the names themselves they're obviously free to override these.

For the WOOEY_DEFAULT_SCRIPT_GROUP there might be the argument that this should be translated on the other end coming out of the database (i.e. stored always as "Scripts" but translated). I'm not sure what the best approach is here. I'm thinking for example if a site is (for whatever reason) set up to have a default French locale this will auto-translate the Scripts script group to French, and this will then be shown to everyone who visits in French.

Evaluate Scripts non-lazy in the wooey_settings file to use the
default system locale for this. Note that this will be overridden
and ignored if this value is set specifically by the user.

Translation has been removed from Wooey! as it's a brand-name.

The site tag remains translated with lazy evaluation, so it can
automatically match the visitors browser settings.
@mfitzp
Copy link
Member Author

mfitzp commented Sep 9, 2015

So (again) the last bit to implement is the confirmation check for new versions.

I've been testing this in combination with the new clinto argparse getter and it seems to be working on anything I throw at it which is good. I'm going to take some time to go over the biocode script repo and update the file parameters so we can detect them correctly — then we'll have a good source for examples of "load lots of scripts quickly".

@Chris7
Copy link
Member

Chris7 commented Sep 11, 2015

👍

@Chris7
Copy link
Member

Chris7 commented Sep 11, 2015

The database should be agnostic to the translations. It will be stored as whatever DEFAULT_SCRIPT_GROUP is set to inside the database (which will be English), and if a translation is provided for the display out, then it will be displayed.

The only caveat obviously is if they change that default group on the database, then another will be created that is the default value again. However, I view this as a usage error and should be corrected by indicating they change the value in their settings as opposed to the database.

@Chris7 Chris7 added this to the 0.9.2 milestone Jan 6, 2016
@Chris7 Chris7 modified the milestones: 0.9.3, 0.9.2 Jan 20, 2016
@mfitzp
Copy link
Member Author

mfitzp commented Mar 10, 2016

To update on what I was saying 6(!) months ago. I hit a problem with the biocode script repo in that most of the scripts required the ability to import relative files (i.e. they weren't self contained but relied on utils/base scripts that were packaged with them). This comes back to out support for this sort of thing generally but is a separate issue to this one.

I'll take a further look at the new version confirmation check bit (once I've figured out what I meant by that) and we should be good to merge. Surprised it is still able ;)

@Chris7
Copy link
Member

Chris7 commented Mar 10, 2016

Maybe we can have a new model that is Worker Settings that can hold additional environmental variables so users can add paths/etc.

A simple model would be:

class WorkerEnvironmentVariables():
 settings = FK(WorkerSettings)
 name = xxx
 value = xxx
 append = boolean (to append to name or to overwrite it)

class WorkerSettings():
  settings_name = textfield (for naming the settings group)

Then the worker can just do os.environ[name] = value and such

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants