Move all documentation hosting to stable URLs #2797
Comments
Just to write up some progress on this. I started a branch that removes the “no-reference” redirects and attempts to host documentation from those URLs. So, for example:
Hosts the documentation from the files in S3 at:
Unfortunately, the We need to generate a new set of documentation for the default documentation set, with a The issue here is that we don’t know what the “latest” version is at build time. It could be a default branch version, a pre-release, or a stable release. |
Progress is in the
As discussed, I've had a look into how many packages opt-in to generate docs but don't have any releases and the number is quite small: 14.

However, there's a different, significantly larger set of packages that would currently be affected if we only ask Google to index packages with release docs: packages that generate docs but have not had a release since they opted into doc generation. They also don't have release docs. There are 95 of those, which is ~15% of all packages requesting docs (95 of 629).

While not ideal, I think as a first stab it'd be OK if we didn't support having those indexed by Google off the bat, since it's going to be quite tricky to do so. They're no worse off than they are currently, where we don't have them indexed by Google on any version, and the remedy is actually quite easy: simply tag a release.

Queries:

-- packages that generate docs on def branch and have no releases whatsoever
select p.url from
packages p join (
select v.package_id
from versions v
where v.package_id in (
-- has docs on default branch
select distinct v.package_id
from versions v
where
v.spi_manifest::text like '%documentation_targets%'
and latest is not null
)
group by v.package_id
having count(*) = 1
) t on p.id = t.package_id
order by p.url

-- packages that generate docs but have no latest release docs
select distinct p.url
from packages p
join versions v on v.package_id = p.id
where v.spi_manifest::text like '%documentation_targets%'
and latest is not null
group by p.url
having count(*) = 1
order by p.url
We could kick off re-builds of these for the latest stable release.
Unfortunately that won't work, because they're not opted into doc generation on those old tags. Only a new release would actually have an
Ah, of course. That's a shame.
I've thought of a pretty simple way for us to determine from within the builder itself whether a default branch build with docs should generate the docs for "latest release" or not. We can just list the tags and run them through It's perhaps not 100% ideal in that the data doesn't come from the server but then again the source of truth is actually the repository we've checked out, so we're definitely looking in the right place.
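To make the tag-listing idea concrete, here is a minimal sketch. It assumes the tags are run through some semantic version parser (the exact tool is not named above), and the regex and the `latest_stable_tag` helper are hypothetical stand-ins rather than what the builder actually uses.

```python
import re
from typing import Iterable, Optional

# Hypothetical, simplified semantic version pattern: optional "v" prefix,
# three numeric components, optional pre-release and build metadata.
_SEMVER = re.compile(
    r"^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$"
)


def latest_stable_tag(tags: Iterable[str]) -> Optional[str]:
    """Return the highest tag that parses as a stable (non pre-release) version."""
    best = None
    for tag in tags:
        match = _SEMVER.match(tag)
        # Skip tags that are not versions, and skip pre-releases.
        if match is None or match.group(4) is not None:
            continue
        key = tuple(int(part) for part in match.group(1, 2, 3))
        if best is None or key > best[0]:
            best = (key, tag)
    return best[1] if best else None
```

In the builder the input would presumably be the output of `git tag --list` in the checked-out repository. If the function returns `None`, there is no stable release and no "latest release" doc set to generate.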
Yes that would work. One thing we should consider sending back with the API call and storing in our database is whether it generated a We should also not use the word
From https://git-scm.com/docs/git-check-ref-format
So we could do
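The quoted rules are not reproduced above, so the following is only a sketch of the general idea: pick a URL fragment that git would reject as a ref name, so it can never collide with a real branch or tag. The `could_be_git_ref` helper is hypothetical and implements only a subset of the `git-check-ref-format` rules.

```python
import re

# Characters git does not allow anywhere in a ref name:
# ASCII control characters, space, DEL, ~ ^ : ? * [ and backslash.
_FORBIDDEN_CHARS = re.compile(r"[\x00-\x20\x7f~^:?*\[\\]")


def could_be_git_ref(name: str) -> bool:
    """Partial check of the git-check-ref-format rules (subset only)."""
    if not name or name == "@":
        return False
    if _FORBIDDEN_CHARS.search(name):
        return False
    if ".." in name or "@{" in name or "//" in name:
        return False
    if name.startswith("/") or name.endswith("/") or name.endswith("."):
        return False
    for component in name.split("/"):
        # No path component may start with "." or end with ".lock".
        if component.startswith(".") or component.endswith(".lock"):
            return False
    return True
```

Under these rules a fragment such as `~release` is rejected, while ordinary names like `main` or `0.4.0` pass.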
I've looked into this a bit this morning on the builder side and there are a few tricky parts we need to consider:
Overall this is quite the change to how we're generating docs and I wish there was an alternative way to duplicate a doc set that we could simply tack on to the existing process. It's really unfortunate that the doc archives have structural components embedded that don't make them re-hostable. I don't know if this is fundamentally impossible to change but I wonder if it'd be worth at least bringing up with the docc folks. Maybe there's an upstream change possible such that the doc gen complexities on our end could at least be only temporary if not outright avoided. |
Another route worth exploring: Right now we generate docs as follows:
It is my understanding that both processes essentially call out to I'm pretty sure we could also generate docs as follows in case of SwiftPM based builds:
The advantage would be that both now have the same second stage, If Both paths of doc generation would generate a doc archive and then we run two passes of The downside of this process is that we might be diverging from how users generate docs and therefore make it harder to compare results in case there are problems (probably not a huge downside tbh).
It still feels like we're fighting a downstream problem that could perhaps be better addressed upstream in docc. For example, I've generated docs for the same package twice purely with different base paths. The only difference in the output was the base paths in the If somehow instead of taking full base paths the Pinging @ethan-kusters and @franklinsch et al - is this something worth discussing? |
For reference, I ended up running
on an archive I created via
and the diff of one of the
11c11
< "/swiftpackageindex/semanticversion/0.4.0/favicon.ico">
---
> "/swiftpackageindex/semanticversion/~release/favicon.ico">
13c13
< "/swiftpackageindex/semanticversion/0.4.0/favicon.svg" color=
---
> "/swiftpackageindex/semanticversion/~release/favicon.svg" color=
18c18
< var baseUrl = "/swiftpackageindex/semanticversion/0.4.0/"
---
> var baseUrl = "/swiftpackageindex/semanticversion/~release/"
19a20,27
> <script defer="defer" src=
> "/swiftpackageindex/semanticversion/~release/js/chunk-vendors.bdb7cbba.js"
> type="text/javascript">
> </script>
> <script defer="defer" src=
> "/swiftpackageindex/semanticversion/~release/js/index.2871ffbd.js"
> type="text/javascript">
> </script>
21,135c29
< "/swiftpackageindex/semanticversion/0.4.0/css/chunk-c0335d80.10a2f091.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/documentation-topic.1d1eec04.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/documentation-topic~topic.b6287bcf.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/documentation-topic~topic~tutorials-overview.d6f5411c.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/topic.d8c126f3.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/tutorials-overview.c249c765.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/chunk-2d0d3105.cd72cc8e.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/chunk-c0335d80.76a68cc5.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/documentation-topic.57e91f8a.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/documentation-topic~topic.1679ec90.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/documentation-topic~topic~tutorials-overview.90c61522.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-bash.1b52852f.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-c.d1db3f17.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-cpp.eaddddbe.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-css.75eab1fe.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-custom-markdown.7cffc4b3.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-custom-swift.5cda5c20.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-diff.62d66733.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-http.163e45b6.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-java.8326d9d8.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-javascript.acb8a8eb.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-json.471128d2.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-llvm.6100b125.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-markdown.90077643.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-objectivec.bcdf5156.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-perl.757d7b6f.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-php.cc8d6c27.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-python.c214ed92.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-ruby.f889d392.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-scss.62ee18da.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-shell.dd7f411f.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-swift.84f3e88c.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-xml.9c3688c7.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/topic.8cd0c0c4.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/tutorials-overview.2a32cd6f.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/index.038e887c.css"
< rel="preload" as="style">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/chunk-vendors.ba2dd0cb.js"
< rel="preload" as="script">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/index.e8a5d294.js"
< rel="preload" as="script">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/index.038e887c.css"
---
> "/swiftpackageindex/semanticversion/~release/css/index.ff036a9e.css"
137,139d30
< <style type="text/css">
< .noscript{font-family:"SF Pro Display","SF Pro Icons","Helvetica Neue",Helvetica,Arial,sans-serif;margin:92px auto 140px auto;text-align:center;width:980px}.noscript-title{color:#111;font-size:48px;font-weight:600;letter-spacing:-.003em;line-height:1.08365;margin:0 auto 54px auto;width:502px}@media only screen and (max-width:1068px){.noscript{margin:90px auto 120px auto;width:692px}.noscript-title{font-size:40px;letter-spacing:0;line-height:1.1;margin:0 auto 45px auto;width:420px}}@media only screen and (max-width:735px){.noscript{margin:45px auto 60px auto;width:87.5%}.noscript-title{font-size:32px;letter-spacing:.004em;line-height:1.125;margin:0 auto 35px auto;max-width:330px;width:auto}}#loading-placeholder{display:none}
< </style>
142,148c33
< <noscript>
< <div class="noscript">
< <h1 class="noscript-title">This page requires JavaScript.</h1>
< <p>Please turn on JavaScript in your browser and refresh the page
< to view its content.</p>
< </div>
< </noscript>
---
> <noscript>[object Module]</noscript>
150,156d34
< <script src=
< "/swiftpackageindex/semanticversion/0.4.0/js/chunk-vendors.ba2dd0cb.js"
< type="text/javascript">
< </script><script src=
< "/swiftpackageindex/semanticversion/0.4.0/js/index.e8a5d294.js"
< type="text/javascript">
< </script>
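If the base path really were the only difference between two archives, re-hosting one under another reference could in principle be a plain string substitution. A minimal sketch of that idea follows; `rewrite_base_path` is a hypothetical helper, and it assumes owner and repository appear lowercased in the generated HTML, as they do in the diff above.

```python
def rewrite_base_path(html: str, owner: str, repo: str,
                      from_ref: str, to_ref: str) -> str:
    """Replace every occurrence of one hosting base path with another."""
    prefix = f"/{owner.lower()}/{repo.lower()}/"
    old = f"{prefix}{from_ref}/"
    new = f"{prefix}{to_ref}/"
    return html.replace(old, new)


page = (
    'var baseUrl = "/swiftpackageindex/semanticversion/0.4.0/"\n'
    '<link href="/swiftpackageindex/semanticversion/0.4.0/favicon.ico">'
)
rewritten = rewrite_base_path(
    page, "SwiftPackageIndex", "SemanticVersion", "0.4.0", "~release"
)
```

Note that the diff above also shows differing hashed asset names, so a pure substitution like this only holds when both archives come from the same renderer output.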
I've tested the overhead of generating a It adds around 2 minutes of additional time, including all doc generation and uploading the 128 MB of docs. Subsequent processing doesn't impact our time limit and it happens asynchronously. The total time for swift-syntax is just under 6 minutes, so we're well clear of the 10min limit here. NB: swift-syntax is likely one of the most critical packages but there's a chance that a package with a slower build time might be more at risk going over the limit. I'll try to get typical build times for packages with docs. |
FYI, I've chosen
New run timed out, this is going to be a problem: https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6100021479
We could of course increase the timeout but the problem with that is that it'll then cause more trouble when we hit a slow build and make the delays worse. However, it should be possible to set the timeout dynamically based on package details such that we could give only the packages that are generating docs more time. That in combination with the new machines should prevent us from running into timeout problems here.
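A rough sketch of what a dynamic timeout could look like. The 10 minute default is the limit mentioned earlier in the thread; the extra allowance for doc builds and the `BuildRequest` type are hypothetical and only there to illustrate the shape of the decision.

```python
from dataclasses import dataclass

DEFAULT_TIMEOUT_MIN = 10  # the current limit discussed in this thread
DOCS_EXTRA_MIN = 10       # hypothetical allowance, not a value from the thread


@dataclass
class BuildRequest:
    package: str
    generates_docs: bool


def timeout_minutes(request: BuildRequest) -> int:
    """Give only doc-generating builds more time; everything else keeps the default."""
    extra = DOCS_EXTRA_MIN if request.generates_docs else 0
    return DEFAULT_TIMEOUT_MIN + extra
```

This keeps slow non-doc builds from holding a runner longer than they do today while giving doc builds room for a second doc set.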
Looking at the slowest doc builds, swift-syntax isn't actually in the top 10:
The slowest at 6min is actually already sitting at 7.5min job duration (due to cloning, reporting etc overhead), so we're awfully close or over if we duplicate doc generation. |
FYI, I've had to change the url fragment to
I've looked into the documentation routing issue we discussed on Monday and unless I'm mistaken (which I hope 😅), the suggested solution outlined in the Custom Routing docs and in David's WWDC video won't work for us.
If I route this to a doc archive without a base path (i.e. generated simply via
Now in the case of a single doc site, the custom routing docs simply route all
However, we can't do that, because we're hosting hundreds of archives, and different versions, and so we need the base path to know which doc archive to route to. I've cross-posted this to the DocWG's slack here: https://swift-open-source.slack.com/archives/C04PCMXMBD0/p1707998191445999 |
I've created a branch The rewriting seems to kick in ok (looking at the source), however the Vue app ends up in an error state for some reason: Not sure what's going on there. There are no errors in the console (unless I'm looking in the wrong place) and there are no 404s or anything in the server logs either. Needs more investigation. |
What's interesting is that
and
return the same html except for
and the former is displaying correctly.
I have a working doc hosting setup now from a stable URL via rewrites that doesn't require us to regenerate docs nor redirect. The one downside is that we need an additional "anchor" in the doc url in order to distinguish doc routes and make them routable in our DocC proxy.

Doc urls with references are unchanged:

✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/0.4.0/documentation/semanticversion

Doc var baseUrl = "/swiftpackageindex/semanticversion/0.4.0/"
</script>
<link href="/swiftpackageindex/semanticversion/0.4.0/css/chunk-c0335d80.10a2f091.css" rel="prefetch"/>

Default docs could be hosted as

✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/current/documentation/semanticversion

Doc var baseUrl = "/swiftpackageindex/semanticversion/current/"
</script>
<link href="/swiftpackageindex/semanticversion/current/css/chunk-c0335d80.10a2f091.css" rel="prefetch"/>

I was hoping to get

❌ http://localhost:8080/SwiftPackageIndex/SemanticVersion/documentation/semanticversion

to work, but it doesn't. The problem here is that

✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/documentation/documentation/semanticversion

This does work but feels like an odd url. In general, any path element of our choosing will do. For instance

✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/_/documentation/semanticversion

The reason we can't really drop the "anchor" is that it would overlay resource paths with our existing resources. For example, the list of DocC resource paths is
Some of these collide with our static resources. I have not tried but I could imagine we might be able to make this work if we either moved our resources to another path or in our routes checked for resources among both docc and our static resources when for example serving If we did the latter, we'd be mapping We'd also have to ensure that the actual resources have different file names (likely but hard to control since we don't control DocC resource file names). However, we'd now be mixing the rather messy DocC proxy routes with our existing routes, creating a bigger mess. Figuring out if a Unless I'm missing another option I think we'd have to move our static resources to another base path if we wanted to avoid an additional anchor in our doc urls. Given that, I think I'd opt for ✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/_/documentation/semanticversion as the canonical doc url. Finally, there is some value in being able to tell from the url which part of the routing handles it. I.e. we'd know that any |
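To illustrate why the anchor makes routing straightforward, here is a small sketch of classifying a doc path. `parse_doc_route` and `DocRoute` are hypothetical names, and the sketch only looks at `documentation` paths, not the other DocC resource paths.

```python
from typing import NamedTuple, Optional

ANCHOR = "_"  # the anchor path element proposed above


class DocRoute(NamedTuple):
    owner: str
    repo: str
    reference: Optional[str]  # None means default docs via the stable URL
    rest: str


def parse_doc_route(path: str) -> Optional[DocRoute]:
    """Split /owner/repo/<reference or anchor>/documentation/... into its parts."""
    parts = [p for p in path.split("/") if p]
    # Third element is always a reference or the anchor, fourth is "documentation".
    if len(parts) < 4 or parts[3] != "documentation":
        return None
    owner, repo, ref = parts[0], parts[1], parts[2]
    rest = "/".join(parts[3:])
    return DocRoute(owner, repo, None if ref == ANCHOR else ref, rest)
```

With a fixed position for the reference or anchor, a path without one (the ❌ case above) simply does not match, so it cannot be confused with our existing routes.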
PR #2961 is in preparation for this change. I've run the following additional manual tests to ensure all doc urls keep working:
These files (attached below) can then be run against DEV via
We currently host documentation on a variety of URLs:
[owner]/[repo]/main/documentation/package
[owner]/[repo]/0.1.0/documentation/package
[owner]/[repo]/0.1.0-pre1/documentation/package
and every time there is a new release, instead of redirecting to /0.1.0/documentation/package we now start redirecting to /0.2.0/documentation/package. We always set the canonical URLs to the new version and update sitemaps to point at the new version, but this is causing a huge amount of churn in what we are asking Google to index and is contributing to our ongoing search index issues.

We need to host documentation like this:
[owner]/[repo]/documentation/package
This is the canonical URL for a package's documentation, and is not a redirect. This should have a canonical URL meta tag and header pointing to itself and be marked for indexing.
[owner]/[repo]/[reference]/documentation/package
This is a canonical path to a reference specific documentation set, but should not be marked as canonical and should be excluded from Google indexing with a noindex tag and header. These pages should all point their canonical URL to the page above.

Notes:
Steps
[owner]/[repo]/documentation/package URL
[owner]/[repo]/documentation/package URL
noindex via a meta tag and HTTP header on every documentation page apart from the canonical page
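As a sketch of the header side of this, assuming the site lives at `https://swiftpackageindex.com` and that the canonical page is the one served without a reference: `indexing_headers` is a hypothetical helper, while `Link` with `rel="canonical"` and `X-Robots-Tag: noindex` are the standard HTTP equivalents of the canonical and noindex meta tags.

```python
from typing import Dict, Optional

BASE_URL = "https://swiftpackageindex.com"  # assumption for illustration


def indexing_headers(owner: str, repo: str,
                     reference: Optional[str], doc_path: str) -> Dict[str, str]:
    """Headers for a doc page; reference=None means the stable, canonical URL."""
    canonical = f"{BASE_URL}/{owner}/{repo}/documentation/{doc_path}"
    # Every page points its canonical URL at the stable page.
    headers = {"Link": f'<{canonical}>; rel="canonical"'}
    if reference is not None:
        # Reference specific pages are excluded from indexing.
        headers["X-Robots-Tag"] = "noindex"
    return headers
```

The same decision would also drive the corresponding meta tags in the page itself.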