Comparing changes

base repository: actions/runner
base: v2.313.0
...
head repository: actions/runner
compare: v2.314.0
  • 11 commits
  • 35 files changed
  • 7 contributors

Commits on Feb 7, 2024

  1. Prepare v2.313.0 Release (#3137)

    * update runnerversion
    
    * update releaseNote.md
    
    * update-releasenote
    luketomlinson authored Feb 7, 2024
    31318d8

Commits on Feb 9, 2024

  1. 7255957

Commits on Feb 15, 2024

  1. Process snapshot tokens (#3135)

    * Added Snapshot TemplateToken to AgentJobRequestMessage
    
    * WIP for processing the snapshot token
    
    * Changed snapshot post job step condition to Success, added comments
    
    * Refactored snapshot post-job step
    
    * Added evaluation of snapshot token to retrieve image name
    
    * Added snapshot to workflow schema
    
    * Fixed linter error
    
    * Migrated snapshot logic to new SnapshotOperationProvider
    
    * Fixed linter error
    
    * Fixed linter errors
    
    * Fixed linter error
    
    * Fixed linter errors
    
    * Updated L0 tests
    
    * Fixed linter errors
    
    * Added new JobExtensionL0 tests for snapshot post-job step
    
    * Added JobExtensionL0 test case for snapshot mappings
    
    * Added SnapshotOperationProviderL0 tests
    
    * Enabled nullable types for SnapshotOperationProvider and its tests
    
    * Added more assertions to SnapshotOperationProviderL0 tests
    
    * Fixed linter errors
    
    * Made sure TestHostContexts are disposed of properly in SnapshotOperationProviderL0 tests
    
    * Resolved PR comments
    
    * Fixed formatting
    
    * Removed redundant reference
    
    * Addressed PR comments
    davidomid authored Feb 15, 2024
    927b26a

Commits on Feb 19, 2024

  1. Upgrade dotnet sdk to v6.0.419 (#3158)

    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    github-actions[bot] authored Feb 19, 2024

    Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    3db5c90

Commits on Feb 21, 2024

  1. handle broker run service exception handling (#3163)

    * handle run service exception handling
    
    * force fail always
    
    * format
    
    * format
    yaananth authored Feb 21, 2024
    b19b946
  2. Add a retry logic to docker login operation (#3089)

    While there's an existing retry mechanism for the `docker pull` command
    [^1], it's missing for `docker login`.
    
    Similar to the `docker pull` scenario, the container registry could
    potentially be briefly unavailable or inaccessible, leading to a failed
    `docker login` attempt and subsequent workflow run failures.
    
    Since it's a container-based workflow, there is no way to retry on
    the customer side. The runner should retry itself.
    
    It also aligns with community feedback [^2].
    
    [^1]: https://github.com/actions/runner/blob/8e0cd36cd8c74c3067ffe10189c1e42f7e753af2/src/Runner.Worker/ContainerOperationProvider.cs#L201
    [^2]: https://github.com/orgs/community/discussions/73069
    
    Co-authored-by: Thomas Boop <52323235+thboop@users.noreply.github.com>
    enescakir and thboop authored Feb 21, 2024
    6603bfb
  3. Broker fixes for token refreshes and AccessDeniedException (#3161)

    luketomlinson authored Feb 21, 2024
    3449d5f
  4. Remove USE_BROKER_FLOW (#3162)

    luketomlinson authored Feb 21, 2024
    d296014

Commits on Feb 26, 2024

  1. Refresh Token for BrokerServer (#3167)

    luketomlinson authored Feb 26, 2024
    034c51c
  2. Better step timeout message. (#3166)

    TingluoHuang authored Feb 26, 2024
    601d3de
  3. Prepare v2.314.0 release (#3172)

    * Prepare v2.314.0 release
    
    * update releaseNoteMd
    luketomlinson authored Feb 26, 2024
    d8bce88
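
The retry behaviour motivated in #3089 can be sketched roughly as below. This is a minimal illustration only, not the code in `ContainerOperationProvider.cs`: the class name `LoginRetrySketch`, the helper `RetryAsync`, the attempt count, and the delay are all assumptions made for the example. The operation is modelled as a function returning a process-style exit code, where 0 means success.

```csharp
using System;
using System.Threading.Tasks;

public static class LoginRetrySketch
{
    // Hypothetical helper (not part of the runner): runs an operation that
    // returns a process-style exit code and retries it until it returns 0
    // or the allowed number of attempts is used up.
    public static async Task<int> RetryAsync(Func<Task<int>> operation, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        int exitCode = -1;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            exitCode = await operation();
            if (exitCode == 0)
            {
                return exitCode; // success, stop retrying
            }

            if (attempt < maxAttempts)
            {
                // Assumed fixed pause between attempts; a real implementation
                // might prefer a randomized or growing backoff.
                await Task.Delay(delay);
            }
        }

        // All attempts failed: hand back the last non-zero exit code.
        return exitCode;
    }
}
```

A caller would wrap the `docker login` invocation in the `operation` delegate and treat a non-zero result as a failed login after the final attempt.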
Showing with 516 additions and 173 deletions.
  1. +2 −2 .devcontainer/devcontainer.json
  2. +15 −21 releaseNote.md
  3. +31 −1 src/Runner.Common/BrokerServer.cs
  4. +2 −1 src/Runner.Common/RunServer.cs
  5. +63 −65 src/Runner.Listener/JobDispatcher.cs
  6. +13 −13 src/Runner.Listener/MessageListener.cs
  7. +1 −0 src/Runner.Listener/Runner.cs
  8. +0 −5 src/Runner.Sdk/Util/VssUtil.cs
  9. +29 −7 src/Runner.Worker/ContainerOperationProvider.cs
  10. +12 −0 src/Runner.Worker/JobExtension.cs
  11. +32 −0 src/Runner.Worker/SnapshotOperationProvider.cs
  12. +1 −1 src/Runner.Worker/StepsRunner.cs
  13. +1 −19 src/Sdk/Common/Common/VssHttpMessageHandler.cs
  14. +0 −10 src/Sdk/Common/Common/VssHttpRequestSettings.cs
  15. +9 −0 src/Sdk/DTPipelines/Pipelines/AgentJobRequestMessage.cs
  16. +2 −0 src/Sdk/DTPipelines/Pipelines/ObjectTemplating/PipelineTemplateConstants.cs
  17. +33 −0 src/Sdk/DTPipelines/Pipelines/ObjectTemplating/PipelineTemplateConverter.cs
  18. +26 −0 src/Sdk/DTPipelines/Pipelines/ObjectTemplating/PipelineTemplateEvaluator.cs
  19. +17 −0 src/Sdk/DTPipelines/Pipelines/Snapshot.cs
  20. +20 −1 src/Sdk/DTPipelines/workflow-v1.0.json
  21. +3 −0 src/Sdk/RSWebApi/Contracts/AcquireJobRequest.cs
  22. +2 −0 src/Sdk/RSWebApi/RunServiceHttpClient.cs
  23. +6 −1 src/Sdk/WebApi/WebApi/BrokerHttpClient.cs
  24. +3 −2 src/Test/L0/Listener/JobDispatcherL0.cs
  25. +1 −1 src/Test/L0/Listener/RunnerL0.cs
  26. +1 −1 src/Test/L0/Worker/ActionCommandManagerL0.cs
  27. +1 −1 src/Test/L0/Worker/CreateStepSummaryCommandL0.cs
  28. +16 −16 src/Test/L0/Worker/ExecutionContextL0.cs
  29. +91 −1 src/Test/L0/Worker/JobExtensionL0.cs
  30. +1 −0 src/Test/L0/Worker/JobRunnerL0.cs
  31. +78 −0 src/Test/L0/Worker/SnapshotOperationProviderL0.cs
  32. +1 −1 src/Test/L0/Worker/WorkerL0.cs
  33. +1 −1 src/dev.sh
  34. +1 −1 src/global.json
  35. +1 −1 src/runnerversion
4 changes: 2 additions & 2 deletions .devcontainer/devcontainer.json
@@ -4,13 +4,13 @@
 "features": {
 "ghcr.io/devcontainers/features/docker-in-docker:1": {},
 "ghcr.io/devcontainers/features/dotnet": {
-"version": "6.0.418"
+"version": "6.0.419"
 },
 "ghcr.io/devcontainers/features/node:1": {
 "version": "16"
 },
 "ghcr.io/devcontainers/features/sshd:1": {
-"version": "latest"
+"version": "latest"
 }
 },
 "customizations": {
36 changes: 15 additions & 21 deletions releaseNote.md
@@ -1,29 +1,23 @@
 ## What's Changed
-* Fix `buildx` installation by @ajschmidt8 in https://github.com/actions/runner/pull/2952
-* Create close-features and close-bugs bot for runner issues by @ruvceskistefan in https://github.com/actions/runner/pull/2909
-* Send disableUpdate as query parameter by @luketomlinson in https://github.com/actions/runner/pull/2970
-* Handle SelfUpdate Flow when Package is provided in Message by @luketomlinson in https://github.com/actions/runner/pull/2926
-* Bump container hook version to 0.5.0 in runner image by @nikola-jokic in https://github.com/actions/runner/pull/3003
-* Set `ImageOS` environment variable in runner images by @int128 in https://github.com/actions/runner/pull/2878
-* Mark job as failed on worker crash. by @TingluoHuang in https://github.com/actions/runner/pull/3006
-* Include whether http proxy configured as part of UserAgent. by @TingluoHuang in https://github.com/actions/runner/pull/3009
-* Add codeload to the list of service we check during '--check'. by @TingluoHuang in https://github.com/actions/runner/pull/3011
-* close reason update by @ruvceskistefan in https://github.com/actions/runner/pull/3027
-* Update envlinux.md by @adjn in https://github.com/actions/runner/pull/3040
-* Extend `--check` to check Results-Receiver service. by @TingluoHuang in https://github.com/actions/runner/pull/3078
-* Use Azure SDK to upload files to Azure Blob by @yacaovsnc in https://github.com/actions/runner/pull/3033
-* Remove code in runner for handling trimmed packages. by @TingluoHuang in https://github.com/actions/runner/pull/3074
-* Update dotnet sdk to latest version @6.0.418 by @github-actions in https://github.com/actions/runner/pull/3085
-* Patch Curl to no longer use -k by @thboop in https://github.com/actions/runner/pull/3091
+* Prepare v2.313.0 Release by @luketomlinson in https://github.com/actions/runner/pull/3137
+* Pass RunnerOS during job acquire. by @TingluoHuang in https://github.com/actions/runner/pull/3140
+* Process `snapshot` tokens by @davidomid in https://github.com/actions/runner/pull/3135
+* Update dotnet sdk to latest version @6.0.419 by @github-actions in https://github.com/actions/runner/pull/3158
+* handle broker run service exception handling by @yaananth in https://github.com/actions/runner/pull/3163
+* Add a retry logic to docker login operation by @enescakir in https://github.com/actions/runner/pull/3089
+* Broker fixes for token refreshes and AccessDeniedException by @luketomlinson in https://github.com/actions/runner/pull/3161
+* Remove USE_BROKER_FLOW by @luketomlinson in https://github.com/actions/runner/pull/3162
+* Refresh Token for BrokerServer by @luketomlinson in https://github.com/actions/runner/pull/3167
+* Better step timeout message. by @TingluoHuang in https://github.com/actions/runner/pull/3166

 ## New Contributors
-* @int128 made their first contribution in https://github.com/actions/runner/pull/2878
-* @adjn made their first contribution in https://github.com/actions/runner/pull/3040
+* @davidomid made their first contribution in https://github.com/actions/runner/pull/3135
+* @enescakir made their first contribution in https://github.com/actions/runner/pull/3089

-**Full Changelog**: https://github.com/actions/runner/compare/v2.311.0...v2.312.0
+**Full Changelog**: https://github.com/actions/runner/compare/v2.313.0...v2.314.0

-_Note: Actions Runner follows a progressive release policy, so the latest release might not be available to your enterprise, organization, or repository yet.
-To confirm which version of the Actions Runner you should expect, please view the download instructions for your enterprise, organization, or repository.
+_Note: Actions Runner follows a progressive release policy, so the latest release might not be available to your enterprise, organization, or repository yet.
+To confirm which version of the Actions Runner you should expect, please view the download instructions for your enterprise, organization, or repository.
 See https://docs.github.com/en/enterprise-cloud@latest/actions/hosting-your-own-runners/adding-self-hosted-runners_

 ## Windows x64
32 changes: 31 additions & 1 deletion src/Runner.Common/BrokerServer.cs
@@ -21,6 +21,10 @@ public interface IBrokerServer : IRunnerService
 Task DeleteSessionAsync(CancellationToken cancellationToken);

 Task<TaskAgentMessage> GetRunnerMessageAsync(Guid? sessionId, TaskAgentStatus status, string version, string os, string architecture, bool disableUpdate, CancellationToken token);
+
+Task UpdateConnectionIfNeeded(Uri serverUri, VssCredentials credentials);
+
+Task ForceRefreshConnection(VssCredentials credentials);
 }

 public sealed class BrokerServer : RunnerService, IBrokerServer
@@ -59,7 +63,8 @@ public Task<TaskAgentMessage> GetRunnerMessageAsync(Guid? sessionId, TaskAgentSt
 {
 CheckConnection();
 var brokerSession = RetryRequest<TaskAgentMessage>(
-async () => await _brokerHttpClient.GetRunnerMessageAsync(sessionId, version, status, os, architecture, disableUpdate, cancellationToken), cancellationToken);
+async () => await _brokerHttpClient.GetRunnerMessageAsync(sessionId, version, status, os, architecture, disableUpdate, cancellationToken), cancellationToken, shouldRetry: ShouldRetryException);
+

 return brokerSession;
 }
@@ -69,5 +74,30 @@ public async Task DeleteSessionAsync(CancellationToken cancellationToken)
 CheckConnection();
 await _brokerHttpClient.DeleteSessionAsync(cancellationToken);
 }
+
+public Task UpdateConnectionIfNeeded(Uri serverUri, VssCredentials credentials)
+{
+if (_brokerUri != serverUri || !_hasConnection)
+{
+return ConnectAsync(serverUri, credentials);
+}
+
+return Task.CompletedTask;
+}
+
+public Task ForceRefreshConnection(VssCredentials credentials)
+{
+return ConnectAsync(_brokerUri, credentials);
+}
+
+public bool ShouldRetryException(Exception ex)
+{
+if (ex is AccessDeniedException ade && ade.ErrorCode == 1)
+{
+return false;
+}
+
+return true;
+}
 }
 }
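
How a `shouldRetry` predicate such as `ShouldRetryException` can short-circuit a retry loop is sketched below. This is an illustration under stated assumptions, not the runner's `RetryRequest` implementation, which is not shown in this diff: `SketchAccessDeniedException` is a hypothetical stand-in for the SDK's `AccessDeniedException`, and `RetryPredicateSketch.RetryAsync` with its `maxAttempts` parameter is invented for the example.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the SDK's AccessDeniedException; only ErrorCode matters here.
public sealed class SketchAccessDeniedException : Exception
{
    public int ErrorCode { get; }

    public SketchAccessDeniedException(int errorCode) : base("access denied")
    {
        ErrorCode = errorCode;
    }
}

public static class RetryPredicateSketch
{
    // Same decision as the predicate in the diff: an access-denied error with
    // ErrorCode 1 is not retried; every other exception is.
    public static bool ShouldRetry(Exception ex)
    {
        return !(ex is SketchAccessDeniedException ade && ade.ErrorCode == 1);
    }

    // Assumed retry loop: the request is retried only while attempts remain
    // and the predicate agrees. Otherwise the exception propagates to the caller.
    public static async Task<T> RetryAsync<T>(Func<Task<T>> request, int maxAttempts, Func<Exception, bool> shouldRetry)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await request();
            }
            catch (Exception ex) when (attempt < maxAttempts && shouldRetry(ex))
            {
                // Retryable failure with attempts left: fall through and loop again.
            }
        }
    }
}
```

Using an exception filter (`when`) keeps non-retryable failures from being caught at all, so their original stack trace reaches the caller unchanged.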
3 changes: 2 additions & 1 deletion src/Runner.Common/RunServer.cs
@@ -5,6 +5,7 @@
 using GitHub.Actions.RunService.WebApi;
 using GitHub.DistributedTask.Pipelines;
 using GitHub.DistributedTask.WebApi;
+using GitHub.Runner.Common.Util;
 using GitHub.Runner.Sdk;
 using GitHub.Services.Common;
 using Sdk.RSWebApi.Contracts;
@@ -60,7 +61,7 @@ public Task<AgentJobRequestMessage> GetJobMessageAsync(string id, CancellationTo
 {
 CheckConnection();
 return RetryRequest<AgentJobRequestMessage>(
-async () => await _runServiceHttpClient.GetJobMessageAsync(requestUri, id, cancellationToken), cancellationToken,
+async () => await _runServiceHttpClient.GetJobMessageAsync(requestUri, id, VarUtil.OS, cancellationToken), cancellationToken,
 shouldRetry: ex => ex is not TaskOrchestrationJobAlreadyAcquiredException);
 }

128 changes: 63 additions & 65 deletions src/Runner.Listener/JobDispatcher.cs
@@ -35,7 +35,7 @@ public interface IJobDispatcher : IRunnerService
 // This implementation of IJobDispatcher is not thread safe.
 // It is based on the fact that the current design of the runner is a dequeue
 // and processes one message from the message queue at a time.
-// In addition, it only executes one job every time,
+// In addition, it only executes one job every time,
 // and the server will not send another job while this one is still running.
 public sealed class JobDispatcher : RunnerService, IJobDispatcher
 {
@@ -546,13 +546,27 @@ await processChannel.SendAsync(
 Trace.Info($"Return code {returnCode} indicate worker encounter an unhandled exception or app crash, attach worker stdout/stderr to JobRequest result.");

 var jobServer = await InitializeJobServerAsync(systemConnection);
-await LogWorkerProcessUnhandledException(jobServer, message, detailInfo);
-
-// Go ahead to finish the job with result 'Failed' if the STDERR from worker is System.IO.IOException, since it typically means we are running out of disk space.
-if (detailInfo.Contains(typeof(System.IO.IOException).ToString(), StringComparison.OrdinalIgnoreCase))
+var unhandledExceptionIssue = new Issue() { Type = IssueType.Error, Message = detailInfo };
+unhandledExceptionIssue.Data[Constants.Runner.InternalTelemetryIssueDataKey] = Constants.Runner.WorkerCrash;
+switch (jobServer)
 {
-Trace.Info($"Finish job with result 'Failed' due to IOException.");
-await ForceFailJob(jobServer, message, detailInfo);
+case IJobServer js:
+{
+await LogWorkerProcessUnhandledException(js, message, unhandledExceptionIssue);
+// Go ahead to finish the job with result 'Failed' if the STDERR from worker is System.IO.IOException, since it typically means we are running out of disk space.
+if (detailInfo.Contains(typeof(System.IO.IOException).ToString(), StringComparison.OrdinalIgnoreCase))
+{
+Trace.Info($"Finish job with result 'Failed' due to IOException.");
+await ForceFailJob(js, message);
+}
+
+break;
+}
+case IRunServer rs:
+await ForceFailJob(rs, message, unhandledExceptionIssue);
+break;
+default:
+throw new NotSupportedException($"JobServer type '{jobServer.GetType().Name}' is not supported.");
 }
 }

@@ -644,7 +658,7 @@ await processChannel.SendAsync(
 }
 }

-// wait worker to exit
+// wait worker to exit
 // if worker doesn't exit within timeout, then kill worker.
 completedTask = await Task.WhenAny(workerProcessTask, Task.Delay(-1, workerCancelTimeoutKillToken));

@@ -1131,86 +1145,70 @@ private async Task CompleteJobRequestAsync(int poolId, Pipelines.AgentJobRequest
 }

 // log an error issue to job level timeline record
-private async Task LogWorkerProcessUnhandledException(IRunnerService server, Pipelines.AgentJobRequestMessage message, string detailInfo)
+private async Task LogWorkerProcessUnhandledException(IJobServer jobServer, Pipelines.AgentJobRequestMessage message, Issue issue)
 {
-if (server is IJobServer jobServer)
+try
 {
-try
-{
-var timeline = await jobServer.GetTimelineAsync(message.Plan.ScopeIdentifier, message.Plan.PlanType, message.Plan.PlanId, message.Timeline.Id, CancellationToken.None);
-ArgUtil.NotNull(timeline, nameof(timeline));
+var timeline = await jobServer.GetTimelineAsync(message.Plan.ScopeIdentifier, message.Plan.PlanType, message.Plan.PlanId, message.Timeline.Id, CancellationToken.None);
+ArgUtil.NotNull(timeline, nameof(timeline));

-TimelineRecord jobRecord = timeline.Records.FirstOrDefault(x => x.Id == message.JobId && x.RecordType == "Job");
-ArgUtil.NotNull(jobRecord, nameof(jobRecord));
+TimelineRecord jobRecord = timeline.Records.FirstOrDefault(x => x.Id == message.JobId && x.RecordType == "Job");
+ArgUtil.NotNull(jobRecord, nameof(jobRecord));

-var unhandledExceptionIssue = new Issue() { Type = IssueType.Error, Message = detailInfo };
-unhandledExceptionIssue.Data[Constants.Runner.InternalTelemetryIssueDataKey] = Constants.Runner.WorkerCrash;
-jobRecord.ErrorCount++;
-jobRecord.Issues.Add(unhandledExceptionIssue);

-if (message.Variables.TryGetValue("DistributedTask.MarkJobAsFailedOnWorkerCrash", out var markJobAsFailedOnWorkerCrash) &&
-StringUtil.ConvertToBoolean(markJobAsFailedOnWorkerCrash?.Value))
-{
-Trace.Info("Mark the job as failed since the worker crashed");
-jobRecord.Result = TaskResult.Failed;
-// mark the job as completed so service will pickup the result
-jobRecord.State = TimelineRecordState.Completed;
-}
+jobRecord.ErrorCount++;
+jobRecord.Issues.Add(issue);

-await jobServer.UpdateTimelineRecordsAsync(message.Plan.ScopeIdentifier, message.Plan.PlanType, message.Plan.PlanId, message.Timeline.Id, new TimelineRecord[] { jobRecord }, CancellationToken.None);
-}
-catch (Exception ex)
+if (message.Variables.TryGetValue("DistributedTask.MarkJobAsFailedOnWorkerCrash", out var markJobAsFailedOnWorkerCrash) &&
+StringUtil.ConvertToBoolean(markJobAsFailedOnWorkerCrash?.Value))
 {
-Trace.Error("Fail to report unhandled exception from Runner.Worker process");
-Trace.Error(ex);
+Trace.Info("Mark the job as failed since the worker crashed");
+jobRecord.Result = TaskResult.Failed;
+// mark the job as completed so service will pickup the result
+jobRecord.State = TimelineRecordState.Completed;
 }
+
+await jobServer.UpdateTimelineRecordsAsync(message.Plan.ScopeIdentifier, message.Plan.PlanType, message.Plan.PlanId, message.Timeline.Id, new TimelineRecord[] { jobRecord }, CancellationToken.None);
 }
-else
+catch (Exception ex)
 {
-Trace.Info("Job server does not support handling unhandled exception yet, error message: {0}", detailInfo);
-return;
+Trace.Error("Fail to report unhandled exception from Runner.Worker process");
+Trace.Error(ex);
 }
 }

 // raise job completed event to fail the job.
-private async Task ForceFailJob(IRunnerService server, Pipelines.AgentJobRequestMessage message, string detailInfo)
+private async Task ForceFailJob(IJobServer jobServer, Pipelines.AgentJobRequestMessage message)
 {
-if (server is IJobServer jobServer)
+try
 {
-try
-{
-var jobCompletedEvent = new JobCompletedEvent(message.RequestId, message.JobId, TaskResult.Failed);
-await jobServer.RaisePlanEventAsync<JobCompletedEvent>(message.Plan.ScopeIdentifier, message.Plan.PlanType, message.Plan.PlanId, jobCompletedEvent, CancellationToken.None);
-}
-catch (Exception ex)
-{
-Trace.Error("Fail to raise JobCompletedEvent back to service.");
-Trace.Error(ex);
-}
+var jobCompletedEvent = new JobCompletedEvent(message.RequestId, message.JobId, TaskResult.Failed);
+await jobServer.RaisePlanEventAsync<JobCompletedEvent>(message.Plan.ScopeIdentifier, message.Plan.PlanType, message.Plan.PlanId, jobCompletedEvent, CancellationToken.None);
 }
-else if (server is IRunServer runServer)
+catch (Exception ex)
 {
-try
-{
-var unhandledExceptionIssue = new Issue() { Type = IssueType.Error, Message = detailInfo };
-var unhandledAnnotation = unhandledExceptionIssue.ToAnnotation();
-var jobAnnotations = new List<Annotation>();
-if (unhandledAnnotation.HasValue)
-{
-jobAnnotations.Add(unhandledAnnotation.Value);
-}
+Trace.Error("Fail to raise JobCompletedEvent back to service.");
+Trace.Error(ex);
+}
+}

-await runServer.CompleteJobAsync(message.Plan.PlanId, message.JobId, TaskResult.Failed, outputs: null, stepResults: null, jobAnnotations: jobAnnotations, environmentUrl: null, CancellationToken.None);
-}
-catch (Exception ex)
+private async Task ForceFailJob(IRunServer runServer, Pipelines.AgentJobRequestMessage message, Issue issue)
+{
+try
+{
+var annotation = issue.ToAnnotation();
+var jobAnnotations = new List<Annotation>();
+if (annotation.HasValue)
 {
-Trace.Error("Fail to raise job completion back to service.");
-Trace.Error(ex);
+jobAnnotations.Add(annotation.Value);
 }
+
+await runServer.CompleteJobAsync(message.Plan.PlanId, message.JobId, TaskResult.Failed, outputs: null, stepResults: null, jobAnnotations: jobAnnotations, environmentUrl: null, CancellationToken.None);
 }
-else
+catch (Exception ex)
 {
-throw new NotSupportedException($"Server type {server.GetType().FullName} is not supported.");
+Trace.Error("Fail to raise job completion back to service.");
+Trace.Error(ex);
 }
 }
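
The refactoring in this file moves the server-type check out of the helper methods and into a single `switch` at the call site. A minimal sketch of that dispatch shape follows. It is an illustration only: the `ISketchJobServer` and `ISketchRunServer` interfaces, their implementations, and the returned strings are hypothetical placeholders, not the runner's `IJobServer` and `IRunServer` types or their behaviour.

```csharp
using System;

// Hypothetical marker interfaces standing in for the two server kinds.
public interface ISketchJobServer { }
public interface ISketchRunServer { }

public sealed class SketchJobServer : ISketchJobServer { }
public sealed class SketchRunServer : ISketchRunServer { }

public static class DispatchSketch
{
    // One switch on the runtime type picks the handling path, so each
    // handler can take a strongly typed parameter instead of re-checking
    // the type itself. Unknown types fail loudly.
    public static string ChoosePath(object server)
    {
        switch (server)
        {
            case ISketchJobServer _:
                return "timeline";   // placeholder for: log the issue on the timeline record
            case ISketchRunServer _:
                return "annotation"; // placeholder for: complete the job with an annotation
            default:
                throw new NotSupportedException($"Server type '{server?.GetType().Name}' is not supported.");
        }
    }
}
```

With this shape, adding a third server kind means adding one `case`, and forgetting to do so surfaces as a `NotSupportedException` rather than a silently skipped branch.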
