@michelle0927
Collaborator

@michelle0927 michelle0927 commented Dec 17, 2025

Applies changes from #19525

Note: I will create a follow-up PR to update the apify_oauth components accordingly after this is published.

Summary by CodeRabbit

  • New Features

    • Added "Get Key-Value Store Record" action to retrieve records from key-value stores.
    • Added "Run Task" action with webhook support and async polling capabilities.
    • Enhanced actor selection with dynamic source filtering (recently-used or store).
  • Bug Fixes

    • Improved JSON parsing and schema-based editor handling in Run Actor action.
    • Fixed null value storage in Set Key-Value Store Record.
  • Documentation

    • Updated action descriptions and field labels for clarity.

@vercel

vercel bot commented Dec 17, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| pipedream-docs-redirect-do-not-edit | Ignored | Ignored | Dec 17, 2025 7:31pm |

@coderabbitai
Contributor

coderabbitai bot commented Dec 17, 2025

Walkthrough

This PR extends the Apify integration with a new Key-Value Store record retrieval action, a task runner with webhook-based polling and rerun support, API method additions and signature changes in the Apify client, refined logic in the actor runner, and corresponding version bumps across multiple action modules and the package.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| New Key-Value Store Action: components/apify/actions/get-kvs-record/get-kvs-record.mjs | Introduces read-only KVS record retrieval action that determines content type and size via HEAD request, returns parsed JSON/text directly if below 10 MB threshold, or returns signed URL file pointer for larger/binary content. Handles 404 as "record not found". |
| New Task Runner: components/apify/actions/run-task/run-task.mjs | Introduces comprehensive task runner with three execution paths: immediate start (non-blocking), webhook-based polling with 1-day window and 30-second intervals, and rerun/resume with webhook cleanup and terminal status handling. Includes optional input override and memory/build parameters. |
| Actor Runner Logic: components/apify/actions/run-actor/run-actor.mjs | Refactors prepareData() to iterate schema properties instead of input keys; extends setValue() to handle json/schemaBased editors with JSON parsing; refines additionalProps() with selective boolean initialization, empty value filtering, and JSON.stringify handling for defaults; updates raw input selection logic. Version 0.0.6 → 0.0.7. |
| Apify Client API: components/apify/apify.app.mjs | Adds public methods getActorOptions(), getAuthToken(), getKVSRecord(), getKVSRecordUrl(); renames getActorRun() → getRun(); updates runTask() and runTaskSynchronously() to accept input parameter; refactors actor/task/KVS option generation into reusable helpers; removes my: true flag from actor listing; updates descriptions and labels for consistency. |
| Version Bumps (Actions): components/apify/actions/get-dataset-items/get-dataset-items.mjs, components/apify/actions/run-task-synchronously/run-task-synchronously.mjs, components/apify/actions/scrape-single-url/scrape-single-url.mjs | Metadata version increments: 0.0.5 → 0.0.6, 0.0.5 → 0.0.6, 0.0.1.2 → 0.0.1.3 respectively. No functional changes. |
| Set KVS Record: components/apify/actions/set-key-value-store-record/set-key-value-store-record.mjs | Adds unnamed: false constraint to keyValueStoreId propDefinition; adjusts inferFromValue() to remove null from JSON-path handling, storing null as text. Version 0.2.2 → 0.2.3. |
| Dynamic Actor/Task Selection (Sources): components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs | Replaces static actorId propDefinition with dynamic additionalProps() method that resolves actorId options via apify.getActorOptions() based on new actorSource prop (store or recently-used, default recently-used). Version 0.0.5 → 0.0.6. |
| Version Bumps (Sources & Package): components/apify/sources/new-finished-task-run-instant/new-finished-task-run-instant.mjs, components/apify/package.json | Source version 0.0.5 → 0.0.6 (no behavior change); package version 0.3.1 → 0.4.0. |

Sequence Diagram

sequenceDiagram
    actor User
    participant Client
    participant Apify as Apify API
    participant Webhook as Webhook Handler
    participant Storage as Context Storage

    User->>Client: Trigger run-task with waitForFinish=true
    Client->>Apify: startTask(taskId, input, params)
    Apify-->>Client: runId, status=RUNNING
    Client->>Storage: Store runId, createdAt, webhookUrl
    Client->>Apify: createWebhook(taskRunFinished events)
    Apify-->>Client: webhookId
    Client->>Storage: Store webhookId
    Client->>Client: Suspend (await webhook or 1-day timeout)
    
    Note over Webhook,Apify: Task completes on Apify side
    Apify->>Webhook: POST task completion event
    Webhook->>Client: Resume run with webhook event
    
    Client->>Apify: getRun(runId)
    Apify-->>Client: Run status (succeeded/failed)
    alt Terminal Status
        Client->>Apify: deleteWebhook(webhookId)
        Client-->>User: Return final status/result
    else Polling Window Active
        Client->>Storage: Schedule polling rerun (30s interval)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • run-task.mjs (new): Complex webhook-based polling logic with state persistence, multiple error paths, and rerun/resume flow requires careful review of timing guarantees and webhook cleanup.
  • apify.app.mjs: Significant refactoring with method renames (getActorRun → getRun), signature changes (input parameter additions), removed filtering flags (my: true), and new helper methods that affect multiple dependent actions; verify backward compatibility and option generation logic.
  • run-actor.mjs: Dense logic updates in prepareData(), setValue(), and additionalProps() with schema-based transformations and conditional JSON parsing; requires line-by-line verification of property iteration and default handling.
  • new-finished-actor-run-instant.mjs: Dynamic prop composition via additionalProps() with conditional actorId resolution; verify that getActorOptions() integration works correctly with the new actorSource parameter.

Possibly related PRs

Suggested labels

User submitted

Suggested reviewers

  • lcaresia
  • GTFalcao

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description is incomplete and does not follow the required template, which specifies a 'WHY' section that should explain the rationale for the changes. | Complete the description by filling in the 'WHY' section to explain the business rationale, use cases, or problems being addressed by these Apify component updates. |
| Title check | ❓ Inconclusive | The title 'Updating Apify components' is vague and overly generic, failing to convey specific information about the substantial changes being made to the Apify integration. | Consider a more descriptive title that highlights key changes, such as 'Add KVS record retrieval and task running actions to Apify integration' or 'Enhance Apify actions with new KVS and task run capabilities'. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e27319f and 6de4c13.

📒 Files selected for processing (11)
  • components/apify/actions/get-dataset-items/get-dataset-items.mjs (1 hunks)
  • components/apify/actions/get-kvs-record/get-kvs-record.mjs (1 hunks)
  • components/apify/actions/run-actor/run-actor.mjs (5 hunks)
  • components/apify/actions/run-task-synchronously/run-task-synchronously.mjs (1 hunks)
  • components/apify/actions/run-task/run-task.mjs (1 hunks)
  • components/apify/actions/scrape-single-url/scrape-single-url.mjs (1 hunks)
  • components/apify/actions/set-key-value-store-record/set-key-value-store-record.mjs (3 hunks)
  • components/apify/apify.app.mjs (6 hunks)
  • components/apify/package.json (1 hunks)
  • components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs (1 hunks)
  • components/apify/sources/new-finished-task-run-instant/new-finished-task-run-instant.mjs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-12T07:58:39.628Z
Learnt from: matyascimbulka
Repo: PipedreamHQ/pipedream PR: 18308
File: components/apify/actions/run-task-synchronously/run-task-synchronously.mjs:95-0
Timestamp: 2025-09-12T07:58:39.628Z
Learning: The Apify ActorRun object always contains the defaultDatasetId property according to the official documentation, so conditional checks for its existence are not needed when calling listDatasetItems.

Applied to files:

  • components/apify/actions/run-task-synchronously/run-task-synchronously.mjs
  • components/apify/actions/get-dataset-items/get-dataset-items.mjs
  • components/apify/actions/run-actor/run-actor.mjs
📚 Learning: 2025-09-12T08:28:06.736Z
Learnt from: matyascimbulka
Repo: PipedreamHQ/pipedream PR: 18308
File: components/apify/sources/common/base.mjs:17-0
Timestamp: 2025-09-12T08:28:06.736Z
Learning: WEBHOOK_EVENT_TYPE_GROUPS.ACTOR_RUN_TERMINAL from apify/consts is an array containing all terminal Actor run event types: ["ACTOR.RUN.SUCCEEDED", "ACTOR.RUN.FAILED", "ACTOR.RUN.ABORTED", "ACTOR.RUN.TIMED_OUT"]. It should be used directly in the eventTypes field when creating webhooks for future-proofing.

Applied to files:

  • components/apify/actions/run-task/run-task.mjs
  • components/apify/actions/run-actor/run-actor.mjs
  • components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs
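
As an illustration of the second learning, the sketch below builds a webhook request body that subscribes to every terminal run event. The event names are copied from the learning text; the `buildWebhookRequest` helper and the `requestUrl` / `condition.actorTaskId` field layout are assumptions made for this example, not code taken from the PR.

```javascript
// Local copy of the values quoted in the learning. In the components this
// array is said to come from WEBHOOK_EVENT_TYPE_GROUPS.ACTOR_RUN_TERMINAL.
const ACTOR_RUN_TERMINAL = [
  "ACTOR.RUN.SUCCEEDED",
  "ACTOR.RUN.FAILED",
  "ACTOR.RUN.ABORTED",
  "ACTOR.RUN.TIMED_OUT",
];

// Hypothetical helper: assembles the payload for a webhook that fires on any
// terminal event of runs started from one task.
function buildWebhookRequest(requestUrl, actorTaskId) {
  return {
    requestUrl,
    // Pass the whole group so newly added terminal events are picked up.
    eventTypes: ACTOR_RUN_TERMINAL,
    condition: { actorTaskId },
  };
}

const webhook = buildWebhookRequest("https://example.com/resume", "task-123");
console.log(webhook.eventTypes.length);
```

Using the group constant rather than listing events by hand is the future-proofing the learning refers to.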
🧬 Code graph analysis (3)
components/apify/actions/set-key-value-store-record/set-key-value-store-record.mjs (2)
components/apify/actions/run-actor/run-actor.mjs (1)
  • input (319-319)
components/apify/actions/run-task/run-task.mjs (1)
  • input (68-68)
components/apify/actions/get-kvs-record/get-kvs-record.mjs (2)
components/apify/actions/run-task/run-task.mjs (1)
  • $ (204-204)
components/apify/actions/run-actor/run-actor.mjs (1)
  • record (371-374)
components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs (1)
components/apify/actions/run-actor/run-actor.mjs (1)
  • props (187-187)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Lint Code Base
  • GitHub Check: Verify TypeScript components
  • GitHub Check: Publish TypeScript components
  • GitHub Check: pnpm publish
🔇 Additional comments (32)
components/apify/package.json (1)

3-3: LGTM!

Version bump from 0.3.1 to 0.4.0 is appropriate for the new features and actions being added in this PR.

components/apify/actions/get-dataset-items/get-dataset-items.mjs (1)

8-8: LGTM!

Version bump aligns with the package version update. The pagination logic correctly handles limits and offsets.

components/apify/actions/scrape-single-url/scrape-single-url.mjs (1)

9-9: LGTM!

Version bump is consistent with the package-wide updates.

components/apify/sources/new-finished-task-run-instant/new-finished-task-run-instant.mjs (1)

9-9: LGTM!

Version bump is consistent with the package-wide updates.

components/apify/actions/run-task-synchronously/run-task-synchronously.mjs (1)

8-8: LGTM!

Version bump is consistent with the package-wide updates. The implementation correctly uses defaultDatasetId directly without conditional checks, as per the Apify documentation.

components/apify/actions/set-key-value-store-record/set-key-value-store-record.mjs (3)

6-7: LGTM!

Version bump and description update appropriately reflect the action's capabilities.


20-22: LGTM!

Adding unnamed: false to the prop definition appropriately filters out unnamed Key-Value Stores, ensuring users can only select named stores for this action.


43-55: LGTM!

The type inference logic correctly handles all JavaScript primitive types and objects:

  • null, numbers, booleans, arrays, and objects are stored as JSON
  • Strings are evaluated separately for potential JSON parsing
components/apify/actions/run-actor/run-actor.mjs (6)

10-10: LGTM!

Version bump reflects the significant logic changes in this action.


140-152: LGTM!

Iterating over schema properties instead of input data keys ensures only valid schema-defined properties are processed, and skipping undefined values prevents sending empty fields to the API.
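
A minimal sketch of that iteration pattern, assuming a flat schema object keyed by property name. The function body and the sample schema are illustrative and are not the actual `prepareData()` from run-actor.mjs.

```javascript
// Hypothetical stand-in for prepareData(): the real method is not shown in
// this thread, so names and shapes here are assumptions.
function prepareData(schemaProperties, input) {
  const data = {};
  // Walk the schema, not the input, so keys unknown to the schema are dropped.
  for (const key of Object.keys(schemaProperties)) {
    const value = input[key];
    if (value === undefined) continue; // skip fields the user left unset
    data[key] = value;
  }
  return data;
}

const schema = { startUrls: { type: "array" }, maxPages: { type: "integer" } };
const prepared = prepareData(schema, { maxPages: 5, strayKey: "ignored" });
console.log(prepared); // { maxPages: 5 }
```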


218-218: LGTM!

Filtering out empty option values and labels prevents confusing or broken dropdown selections in the UI.


220-226: LGTM!

Using prefill as the default value aligns with Apify's input schema specification, where prefill serves as a suggestion while the actual default is handled by the platform.


228-236: LGTM!

Correctly handles string[] props with special editors by transforming default values appropriately - extracting URLs for requestListSources and stringifying objects for json/schemaBased editors.
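
The transformation described here might look roughly like the following. The helper name and the assumption that each `requestListSources` entry is an object with a `url` field are illustrative and not confirmed by the diff.

```javascript
// Hypothetical helper: converts a schema default into values that fit a
// string[] prop, depending on the schema editor type.
function toStringArrayDefault(editor, defaultValue) {
  if (!Array.isArray(defaultValue)) return defaultValue;
  if (editor === "requestListSources") {
    // Assumed entry shape: { url: "https://..." }
    return defaultValue.map((item) => item.url);
  }
  if (editor === "json" || editor === "schemaBased") {
    // Objects become JSON strings; strings pass through unchanged.
    return defaultValue.map((item) =>
      typeof item === "string" ? item : JSON.stringify(item));
  }
  return defaultValue;
}

const urls = toStringArrayDefault("requestListSources", [{ url: "https://a.example" }]);
const jsonItems = toStringArrayDefault("json", [{ a: 1 }, "x"]);
console.log(urls, jsonItems);
```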


312-319: LGTM!

The input selection logic correctly prioritizes dynamic schema props (data) over the fallback properties object prop, with a clean ternary chain.

components/apify/sources/new-finished-actor-run-instant/new-finished-actor-run-instant.mjs (3)

3-3: LGTM: Import and prop addition for dynamic actor selection.

The apify module is correctly imported and added as a prop, enabling dynamic actor option generation via getActorOptions.

Also applies to: 16-16


17-33: LGTM: Well-structured dynamic prop for actor source selection.

The actorSource prop with reloadProps: true correctly triggers additionalProps() re-evaluation when the user changes the selection between "store" and "recently-used" actors.


35-51: LGTM: Dynamic actorId generation via additionalProps.

The pattern correctly spreads the base propDefinition and provides a custom options() function that delegates to this.apify.getActorOptions(). This aligns with the helper method added in apify.app.mjs.
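
The pattern can be sketched in isolation as below. The `fakeApify` object and the argument shape passed to `getActorOptions()` are assumptions, since the helper's real signature does not appear in this thread.

```javascript
// Stand-in for the app object; only the method name comes from the review.
const fakeApify = {
  async getActorOptions({ actorSource, page }) {
    return [{ label: `${actorSource} actor (page ${page})`, value: "actor-1" }];
  },
};

const source = {
  apify: fakeApify,
  actorSource: "recently-used",
  additionalProps() {
    return {
      actorId: {
        type: "string",
        label: "Actor",
        // The arrow function keeps `this` bound to the component, so the
        // options always reflect the currently selected actorSource.
        options: ({ page }) =>
          this.apify.getActorOptions({ actorSource: this.actorSource, page }),
      },
    };
  },
};

const dynamicProps = source.additionalProps();
dynamicProps.actorId.options({ page: 0 }).then((opts) => console.log(opts));
```

Because `actorSource` is declared with `reloadProps: true`, changing it would cause `additionalProps()` to run again and the option list to be rebuilt.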

components/apify/apify.app.mjs (5)

166-168: LGTM: Auth token getter for API calls.

Exposing the auth token via getAuthToken() is necessary for the direct HTTP calls in the KVS record action.


193-198: LGTM: runTask now accepts input parameter.

The method signature update to pass input to start() aligns with the Apify client API and enables input override functionality in the run-task action.


262-269: LGTM: KVS record access methods.

getKVSRecord and getKVSRecordUrl correctly delegate to the Apify client's key-value store API, enabling the new get-kvs-record action.


270-275: LGTM: runTaskSynchronously now accepts input.

Consistent with runTask changes, the synchronous variant now also passes input to call().


183-186: Method rename successfully completed. No references to the old getActorRun method name remain in the codebase. The new getRun method is properly being called in components/apify/actions/run-task/run-task.mjs.

components/apify/actions/get-kvs-record/get-kvs-record.mjs (4)

1-14: LGTM: Well-annotated action metadata.

The action metadata correctly indicates this is a read-only, non-destructive operation with proper documentation link.


37-52: LGTM: Robust HEAD request handling.

Using validateStatus: () => true to handle all status codes manually is appropriate here, with proper 404 and error handling.
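
With `validateStatus: () => true` the HTTP client never throws on a status code, so the caller has to branch on it explicitly. The sketch below shows one way to do that; the function name, the lowercase header keys, the textual-type check, and the exact byte value used for the 10 MB threshold are all assumptions.

```javascript
// 10 MB threshold mentioned in the walkthrough; whether the action uses
// 10 * 1024 * 1024 or 10,000,000 bytes is not stated.
const MAX_INLINE_BYTES = 10 * 1024 * 1024;

// Hypothetical helper: decides what to do with the result of a HEAD request.
function classifyHead({ status, headers }) {
  if (status === 404) return { kind: "not-found" };
  if (status < 200 || status >= 300) {
    throw new Error(`Unexpected status ${status} from HEAD request`);
  }
  const contentType = headers["content-type"] ?? "";
  const size = Number(headers["content-length"] ?? NaN);
  const isTextual = /json|^text\//.test(contentType);
  // Unknown or oversized content falls back to a signed URL.
  const inline = isTextual && Number.isFinite(size) && size < MAX_INLINE_BYTES;
  return { kind: inline ? "inline" : "signed-url", contentType, size };
}

const small = classifyHead({
  status: 200,
  headers: { "content-type": "application/json; charset=utf-8", "content-length": "120" },
});
console.log(small.kind); // "inline"
```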


69-77: LGTM: Proper type discrimination for JSON vs text.

The check typeof data === "object" && !Array.isArray(data) && data !== null correctly identifies plain JSON objects, returning them directly while wrapping primitives/arrays in a { value } structure.
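
The check quoted above can be exercised on its own. In this sketch the wrapper function name is hypothetical; the condition itself is the one from the review comment.

```javascript
// Hypothetical wrapper around the check quoted in the review comment.
function toActionResult(data) {
  const isPlainObject =
    typeof data === "object" && !Array.isArray(data) && data !== null;
  // Plain objects are returned as-is; everything else is wrapped so the
  // step always exports an object.
  return isPlainObject ? data : { value: data };
}

console.log(toActionResult({ a: 1 }));  // { a: 1 }
console.log(toActionResult([1, 2]));    // { value: [ 1, 2 ] }
console.log(toActionResult(null));      // { value: null }
```

The explicit `data !== null` guard matters because `typeof null` is `"object"` in JavaScript.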


78-87: LGTM: Signed URL fallback for large/binary files.

Returning a signed URL for files that can't be parsed inline is a sensible approach that avoids memory issues with large files.

components/apify/actions/run-task/run-task.mjs (6)

65-76: LGTM: Input parsing with clear error message.

The JSON parsing with descriptive error handling is appropriate for user-provided input.
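
A minimal sketch of this kind of parsing, assuming the input override may arrive as a JSON string, as an already-parsed object, or not at all. The function name and the error wording are illustrative only.

```javascript
// Hypothetical helper: normalizes a user-supplied input override.
function parseInputOverride(raw) {
  if (raw === undefined || raw === null || raw === "") return undefined;
  if (typeof raw !== "string") return raw; // already parsed
  try {
    return JSON.parse(raw);
  } catch (err) {
    // Re-throw with a message that tells the user which field is wrong.
    throw new Error(`Input override must be valid JSON: ${err.message}`);
  }
}

console.log(parseInputOverride('{"maxPages": 3}')); // { maxPages: 3 }
```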


91-100: LGTM: Non-fatal webhook deletion.

Using console.warn for webhook deletion failures is appropriate since cleanup failure shouldn't block the action result.


102-115: LGTM: Polling state persistence across reruns.

The schedulePoll helper correctly persists apifyRunId, pollStartMs, and webhookId via $.flow.rerun() context.
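
The persistence mechanism can be sketched against a fake `$` object. The three state fields are the ones named above; the interval constant, the `maxRetries` argument, and the fake are assumptions for illustration, and the real helper may differ.

```javascript
const POLL_INTERVAL_MS = 30 * 1000; // 30-second interval from the walkthrough

// Hypothetical version of schedulePoll: passes the polling state as the
// rerun context so the next invocation can read it back.
function schedulePoll($, { apifyRunId, pollStartMs, webhookId }, maxRetries) {
  return $.flow.rerun(
    POLL_INTERVAL_MS,
    { apifyRunId, pollStartMs, webhookId },
    maxRetries,
  );
}

// Minimal fake of `$` that records what would be scheduled.
const calls = [];
const fake$ = {
  flow: {
    rerun: (delay, context, retries) => calls.push({ delay, context, retries }),
  },
};

schedulePoll(fake$, { apifyRunId: "run-1", pollStartMs: 0, webhookId: "wh-1" }, 5);
console.log(calls[0]);
```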


153-162: LGTM: Polling window enforcement with cleanup.

Properly enforces the 1-day polling limit and cleans up the webhook before throwing the timeout error.
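
The window check reduces to simple arithmetic on the persisted start time. In the sketch below the constant and function names are hypothetical; the 1-day and 30-second values come from the walkthrough.

```javascript
const POLL_WINDOW_MS = 24 * 60 * 60 * 1000; // 1-day polling window
const POLL_DELAY_MS = 30 * 1000;            // 30-second polling interval

// Hypothetical helper: decides whether to keep polling or give up.
function nextPollAction(pollStartMs, nowMs = Date.now()) {
  const elapsed = nowMs - pollStartMs;
  return elapsed >= POLL_WINDOW_MS
    ? { action: "timeout" } // caller cleans up the webhook, then throws
    : { action: "rerun", delayMs: POLL_DELAY_MS };
}

console.log(nextPollAction(0, 60 * 1000));      // { action: 'rerun', delayMs: 30000 }
console.log(nextPollAction(0, POLL_WINDOW_MS)); // { action: 'timeout' }
```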


170-193: LGTM: Terminal status handling.

The logic correctly handles all terminal statuses, cleans up the webhook, and differentiates between success and failure outcomes.
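
A compact sketch of terminal-status handling, assuming the four terminal Apify run statuses are `SUCCEEDED`, `FAILED`, `TIMED-OUT`, and `ABORTED`. The function name and return shape are illustrative; in the action the webhook would be deleted before either outcome is reported.

```javascript
// Assumed set of terminal run statuses.
const TERMINAL_STATUSES = new Set(["SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"]);

// Hypothetical helper: reports whether a run is finished and surfaces
// non-successful terminal states as errors.
function resolveRun(run) {
  if (!TERMINAL_STATUSES.has(run.status)) return { done: false };
  if (run.status !== "SUCCEEDED") {
    throw new Error(`Apify run ${run.id} ended with status ${run.status}`);
  }
  return { done: true, run };
}

console.log(resolveRun({ id: "r1", status: "RUNNING" })); // { done: false }
```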


196-241: LGTM: Initial execution with webhook and fallback polling.

The flow correctly:

  1. Starts the task and stores context
  2. Creates a suspend point with resume URL
  3. Sets up a webhook for terminal events
  4. Schedules fallback polling via rerun

This provides reliable completion detection with webhook as primary and polling as fallback.

@michelle0927 michelle0927 moved this from Doing to Ready for QA in Component (Source and Action) Backlog Dec 17, 2025
@vunguyenhung
Collaborator

@vunguyenhung vunguyenhung moved this from Ready for QA to Ready for Release in Component (Source and Action) Backlog Dec 18, 2025
@vunguyenhung
Collaborator

Hi everyone, all test cases passed! Ready for release!

Test reports

Collaborator

@sergio-eliot-rodriguez sergio-eliot-rodriguez left a comment

Looking great!

@sergio-eliot-rodriguez sergio-eliot-rodriguez merged commit 67eaf60 into master Dec 18, 2025
9 checks passed
@sergio-eliot-rodriguez sergio-eliot-rodriguez deleted the apify-changes branch December 18, 2025 15:43
@github-project-automation github-project-automation bot moved this from Ready for Release to Done in Component (Source and Action) Backlog Dec 18, 2025
michelle0927 added a commit that referenced this pull request Dec 19, 2025
* new actions, updates

* pnpm-lock.yaml

4 participants