Skip to content
This repository has been archived by the owner on Sep 6, 2020. It is now read-only.

Long build times #97

Open
basz opened this issue Jan 17, 2015 · 18 comments
Open

Long build times #97

basz opened this issue Jan 17, 2015 · 18 comments

Comments

@basz
Copy link

basz commented Jan 17, 2015

Hi,

I haven't use box much but did try to package some zendframework applications with it. I noticed (via running it verbose) that as more files are included took a little bit longer per file to do so. And in the end it was at crawling speed. Took about 1.5 hour to a medium sized project. (a few thousand files)

I noticed that the resulting phar file that is being written to kept changing file size while the build took place. Is this a known issue with a work around?

Could it be related to https://scrutinizer-ci.com/blog/composer-gc-performance-who-is-affected-too ?

Bas

@Bilge
Copy link

Bilge commented Feb 11, 2015

Observed exactly the same behaviour. This is too slow to be usable in practice. This is despite disabling compression and compacting. Running with php -d zend.enable_gc=0 did not help markedly if at all.

@padraic
Copy link

padraic commented Feb 22, 2015

Only ever used it on small items, like CLI apps. Was file size growing constantly, or reducing and then growing back up to slightly bigger each time?

@basz
Copy link
Author

basz commented Feb 22, 2015

Hi @padraic, the latter. File size goes from zero to bigger constantly, it looks like the phar is being rewritten every time a file is added. Box is unusable for me on these kind of projects, so for now rolling my own phar building, which still takes 6-8 minutes to complete.

@kherge
Copy link
Contributor

kherge commented Mar 4, 2015

When you run box build, the entire archive is rebuilt from scratch.

Unfortunately, incremental building isn't supported in this version of box, but it is planned for the next version (#84).

@kherge kherge closed this as completed Mar 4, 2015
@Bilge
Copy link

Bilge commented Mar 4, 2015

Incremental building does not sound like it tackles the core problem here even though it may improve performance in a different way.

@basz
Copy link
Author

basz commented Mar 4, 2015

I am not doing an incremental build here... Shall I create a short movie demonstrating the issue?

Op 4 mrt. 2015 om 17:17 heeft Kevin Herrera notifications@github.com het volgende geschreven:

When you run box build, the entire archive is rebuilt from scratch.

Unfortunately, incremental building isn't supported in this version of box, but it is planned for the next version (#84).


Reply to this email directly or view it on GitHub.

@kherge
Copy link
Contributor

kherge commented Mar 4, 2015

Sure, but I think I might understand the issue. As you build the *.phar file, you notice the file size fluctuating as the build is running, is that correct? I think zip does the same thing. I'm guessing this is just how the phar extension is choosing to build the phar file, which I have no control over. Box itself only removes and recreates the phar once in the very beginning of the process.

@basz
Copy link
Author

basz commented Mar 4, 2015

Yes correct, that's the issue... However I don't think zip does behave in a similar fashion - why would it - as it is perfectly possible to just append data to files. It -feels- like an incompatible configuration option somewhere, because I can't be the first one to try this including large libraries such as zf2 or symfony can I? Anyway, i was shopping someone would recognize a quick solution - not really a problem for me as at this time I was only doing unpaid experimentations. thanks

@arjanvdbos
Copy link

Same problem here, mid size ZF2 + Doctrine project takes about 1,5 hours to build. I really love the simplicity of box, but this makes it useless for me.

It seems that Phing's PharTask had the same problem (https://www.phing.info/trac/ticket/782), they solved it by using Phar::buildFromIterator() which should massively speed up phar generation.

@Bilge
Copy link

Bilge commented Apr 5, 2015

@arjanvdbos That seems like some pertinent information. I hope @kherge is taking note.

@kherge
Copy link
Contributor

kherge commented Apr 6, 2015

The bug ticket mentions that they switched from a one-by-one approach to buildFromIterator(). The one-by-one approach is required by Box in order to process each individual file. While we may likely see a boost in performance, we won't be able to make any changes to those files.

I can probably make this work, but you may only see the performance improvement when none of the file processors available to Box are used.

@kherge kherge reopened this Apr 6, 2015
@basz
Copy link
Author

basz commented Apr 6, 2015

I'm not familiar with the steps box takes, but it might work and still be a lot faster if you copy and process files first to a tmp dir and then import them with a buildFromIterator.

Op 6 apr. 2015 om 17:12 heeft Kevin Herrera notifications@github.com het volgende geschreven:

The bug ticket mentions that they switched from a one-by-one approach to buildFromIterator(). The one-by-one approach is required by Box in order to process each individual file. While we may likely see a boost in performance, we won't be able to make any changes to those files.

I can probably make this work, but you may only see the performance improvement when none of the file processors available to Box are used.


Reply to this email directly or view it on GitHub.

@kherge
Copy link
Contributor

kherge commented Apr 6, 2015

Hah, I like your idea much better!

The current process is something like this:

for each file listed in box.json
    read its contents into memory
    process the contents
    add it to the phar using addFromString()
    move on to next file

The new one will look like this:

for each file listed in box.json
    copy it to a temporary directory
    process the contents
    move on to next file

add the temporary directory using buildFromIterator()

@arjanvdbos
Copy link

Good to see this issue is reopened!

I didn't look at the internal working of Box, but in my opinion it is not required to copy and process the files first, and then generate the Phar.
You could also implement a sort of collection class which implements a Iterator with a callback. In the callback you can handle all the pocessing stuf.

webdevvie added a commit to webdevvie/box2-lib that referenced this issue Jun 6, 2016
…le replacement:

Issue:  box-project/box2#97

Required for the other PR I am making for box2
@webdevvie
Copy link

webdevvie commented Jun 6, 2016

Hi,

I wanted to reduce my build times from 325 seconds a bit :) So I built two branches for both box2 and box2-lib

https://github.com/webdevvie/box2/tree/faster-adding-with-excluded-regexp
https://github.com/webdevvie/box2-lib/tree/faster-adding-with-excluded-regexp

It adds a config value to the box.json file for box2:
"exclude-from-value-replace":[ "/^vendor(.*)/i", "/^src(.*)/i" ]

and uses an ArrayItterator to add the files from a file queue to the phar file.

This reduced the build time for my project to 10 seconds. Which is more than acceptable for me.

I hope this is useful to you. I ran the unit tests for box2, and they are green. But the box2-lib unittests throw a warning about trying to read a fopen(/does/not/exist) but the tests are still green.

Hope this helps out.

My branches:
https://github.com/webdevvie/box2/tree/faster-adding-with-excluded-regexp
https://github.com/webdevvie/box2-lib/tree/faster-adding-with-excluded-regexp

@kalehrishi
Copy link

Anyone working to get this feature to main build? @webdevvie How can I use your version?

@webdevvie
Copy link

@kalehrishi I added the specific fork and branches to my composer.json

i added this to my require array:

        "kherge/box": "dev-faster-adding-with-excluded-regexp as 2.7.2",
        "herrera-io/box": "dev-faster-adding-with-excluded-regexp as 1.6",

then added this:

"repositories": [
        {
          "type": "vcs",
          "url": "git@github.com:webdevvie/box2.git"
        },
        {
          "type": "vcs",
          "url": "git@github.com:webdevvie/box2-lib.git"
        }
    ],

until the branches are merged or some other solution is added.

There are several open pull requests for both projects for over a year now so I think this project is in the phantom zone or something.

@kalehrishi
Copy link

@webdevvie Thanks a lot.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants