Worker Input Configuration (Input Schema)

This document explains how developers should configure the input_schema.json file. This file determines the input form layout presented to users in the web interface of an automation Worker.

input_schema.json is the “face” of your script. By editing this file, you can control which parameters users must fill in before launching the script, such as URLs, keywords, dates, and more, as well as how those fields are displayed, such as dropdowns, checkboxes, and text inputs.

A standard configuration file commonly contains these top-level fields:

  1. description: Introduces the script’s purpose and usage to the user.
  2. concurrency: Defines how the platform splits one Worker run into tasks.
  3. properties: The list of specific parameter settings.

{
  "description": "With our Instagram Reel information Worker tool, after a successful scrape, you can extract the Reel author's username, Reel caption, hashtags used in the post, number of comments on the Reel, Reel publish date, likes count, views count, play count, popular comments, unique post identifier, URL of the Reel's display image or video thumbnail, product type, Reel duration, video URL, post audio link, number of posts on the profile, number of followers on the profile, profile URL, whether the account is a paid partner, and other relevant information.",
  "concurrency": {
    "fields": ["startUrl"]
  },
  "properties": [
    {
      "title": "URL",
      "name": "startUrl",
      "type": "array",
      "editor": "requestList",
      "description": "This parameter is used to specify the Instagram access URL to be fetched.",
      "default": [
        {
          "url": "https://www.instagram.com/reel/C5Rdyj_q7YN/"
        }
      ],
      "required": true
    }
  ]
}

Field Name | Required | Description
description | No | Tool summary. Displayed at the top of the page. You can use it to describe the script’s purpose, notes, and more. There is no length limit.
concurrency | No | Task splitting configuration. Controls how one run is split into multiple tasks. It contains fields and an optional remove_fields.
properties | Yes | Parameter configuration array. This contains all input items, and each element represents one input field or selector on the page.

CoreClaw decides how to split a submitted run by checking the schema in this order:

  1. If concurrency.fields contains at least one non-empty field name, the platform uses the concurrency rules.
  2. If concurrency.fields is empty or missing, the whole submitted input becomes one task.

Field | Type | Description
fields | string[] | Candidate input fields used for task splitting. Each field should match a properties[*].name whose type is array.
remove_fields | string[] | Optional fields to remove from task input when a preferred field has values. Each value should also appear in fields.

The platform first calculates:

preferred = fields - remove_fields

Then it chooses active fields:

  • If preferred has any field with a non-empty value in the submitted input, only preferred fields are active.
  • Otherwise, all fields are active, including fields listed in remove_fields.

Before splitting tasks, the platform filters empty concurrency items. The following values are treated as empty:

  • null
  • Empty or whitespace-only strings, such as "" or " "
  • Empty objects, such as {}
  • Objects where every value is empty, such as { "place_id": "" } or { "foo": null, "bar": "" }

If a concurrency array becomes empty after filtering, that field is treated as having no value.
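
The emptiness rules above can be sketched in Python as follows. This is a minimal illustration, not the platform's implementation: the helper names is_empty_item and filter_items are hypothetical, and applying the rule recursively to nested objects is an assumption that the text does not confirm.

```python
from typing import Any


def is_empty_item(value: Any) -> bool:
    """Return True for values that the rules above treat as empty."""
    if value is None:
        return True
    if isinstance(value, str):
        # Empty or whitespace-only strings.
        return value.strip() == ""
    if isinstance(value, dict):
        # Covers {} and objects whose values are all empty.
        # Recursing into nested objects is an assumption.
        return all(is_empty_item(v) for v in value.values())
    return False


def filter_items(items: list) -> list:
    """Drop empty items; an empty result means the field has no value."""
    return [item for item in items if not is_empty_item(item)]


print(filter_items(["pizza", " ", None, {}, {"place_id": ""}]))  # ['pizza']
```

Numbers such as 0 and the boolean false are not listed as empty, so the sketch keeps them.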

For every generated task:

  • The field currently being split is kept with one item.
  • Other active concurrency fields are kept as [""].
  • Fields disabled through remove_fields are removed from the task input entirely.
  • Non-concurrency fields are copied into every generated task.

Example with keyword fallback:

{
  "concurrency": {
    "fields": ["keywords", "google_maps_urls", "place_ids"],
    "remove_fields": ["keywords"]
  },
  "properties": [
    {
      "title": "Keywords",
      "name": "keywords",
      "type": "array",
      "editor": "stringList",
      "description": "Search keywords",
      "required": false
    },
    {
      "title": "Google Maps URLs",
      "name": "google_maps_urls",
      "type": "array",
      "editor": "requestList",
      "description": "Google Maps URLs",
      "required": false
    },
    {
      "title": "Place IDs",
      "name": "place_ids",
      "type": "array",
      "editor": "stringList",
      "description": "Google Maps place IDs",
      "required": false
    }
  ]
}

If the submitted input is:

{
  "keywords": ["pizza", "iphone"],
  "google_maps_urls": ["urlA", "urlB"],
  "place_ids": [],
  "base_location": "New York, USA"
}

The platform generates two tasks:

{
  "google_maps_urls": ["urlA"],
  "place_ids": [""],
  "base_location": "New York, USA"
}

{
  "google_maps_urls": ["urlB"],
  "place_ids": [""],
  "base_location": "New York, USA"
}

keywords is removed because google_maps_urls has values and keywords is listed in remove_fields.

If google_maps_urls and place_ids are both empty, the platform falls back to keywords and keeps the other concurrency fields as [""].
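
To make the selection and splitting rules concrete, here is a rough Python sketch that reproduces the example above. It is not the platform's code: split_tasks is a hypothetical name, the sketch assumes empty items have already been filtered out of every concurrency array, and it only handles primitive items such as strings.

```python
def split_tasks(concurrency: dict, submitted: dict) -> list:
    """Sketch of the splitting rules for already-filtered, primitive items."""
    fields = [f for f in concurrency.get("fields", []) if f]
    if not fields:
        return [dict(submitted)]  # no concurrency fields: one task

    removed = set(concurrency.get("remove_fields", []))
    preferred = [f for f in fields if f not in removed]

    def has_value(field: str) -> bool:
        return bool(submitted.get(field))

    # Preferred fields win when any of them has a value; otherwise fall back
    # to all fields, including those listed in remove_fields.
    active = preferred if any(has_value(f) for f in preferred) else fields

    # Non-concurrency fields are copied into every generated task.
    common = {k: v for k, v in submitted.items() if k not in fields}

    tasks = []
    for field in active:
        for item in submitted.get(field) or []:
            task = dict(common)
            for other in active:
                # The split field keeps one item; other active fields get [""].
                task[other] = [item] if other == field else [""]
            tasks.append(task)
    if not tasks:
        raise ValueError("concurrency fields have no non-empty fields")
    return tasks


concurrency = {
    "fields": ["keywords", "google_maps_urls", "place_ids"],
    "remove_fields": ["keywords"],
}
submitted = {
    "keywords": ["pizza", "iphone"],
    "google_maps_urls": ["urlA", "urlB"],
    "place_ids": [],
    "base_location": "New York, USA",
}
for task in split_tasks(concurrency, submitted):
    print(task)
```

With this input the sketch yields the same two tasks as listed above. Because inactive fields are never written into a task, keywords is absent rather than kept as [""].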

Case | Result
remove_fields is omitted | Every non-empty field in fields participates in splitting. Task count is the sum of valid items across those fields.
fields contains one field | The run splits by that single array field.
A remove_fields field is disabled | The key is removed from task input entirely. It is not kept as [""].
A preferred field contains only "", null, {}, or objects with only empty values | It is treated as empty and does not trigger remove_fields.
A URL contains & | The platform keeps the URL value as submitted. Avoid re-serializing it in Worker code in a way that HTML-escapes &.
A concurrency array contains very large integers | The platform preserves them during JSON parsing, but strings are still safer if another language or service will read the value.
The generated task count exceeds the limit | The platform counts tasks first and rejects the run before expanding all task payloads. Avoid submitting very large arrays.

Every item inside custom[fieldName], where custom is the submitted input object and fieldName is a concurrency field, follows the same rules:

Item type | Example | Supported | Behavior
Object | { "url": "https://a.com" } | Yes | Merged into the task input instead of staying under the concurrency field name; child values override parent values.
String | "pizza" | Yes | Wrapped as ["pizza"]. Whitespace-only strings are filtered as empty.
Number | 42, 3.14 | Yes | Wrapped as [42] or [3.14]. Very large integers are preserved by the platform parser, but strings are still safer across languages.
Boolean | true | Yes | Wrapped as [true].
null | null | Treated as empty | Filtered before splitting.
Nested array | ["first", "second"] as one item | No | Causes a runtime error.
Mixed object and primitive items | [{ "url": "a" }, "x"] | No | Causes a runtime error. Use one item shape per field.
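
The difference between object items and primitive items can be illustrated with a short Python sketch. The function name build_task_input and the parent argument (the fields shared by every task) are hypothetical, and the sketch assumes the item has already passed validation, so nested arrays are not handled here.

```python
def build_task_input(field: str, item, parent: dict) -> dict:
    """Combine one concurrency item with the shared parent input."""
    task = dict(parent)
    if isinstance(item, dict):
        # Object items are merged into the task input;
        # child values override parent values.
        task.update(item)
    else:
        # Primitive items are wrapped in a one-element array
        # under the concurrency field name.
        task[field] = [item]
    return task


print(build_task_input("startUrl", {"url": "https://a.com"}, {"url": "old", "limit": 5}))
print(build_task_input("keywords", "pizza", {"limit": 5}))
```

In the first call the item's url replaces the parent's url and no startUrl key appears in the result; in the second call the string ends up as keywords: ["pizza"].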

Error message | Cause | Fix
input_schema is not a valid json | The schema file is not valid JSON. | Validate the file before upload.
custom parameters must contain a single JSON object | Submitted input is not one top-level object. | Send a single JSON object as input.
concurrency fields must have at least one field | concurrency.fields contains no valid field names. | Add at least one field name.
concurrency fields have no non-empty fields | All configured concurrency fields are empty after filtering. | Submit at least one non-empty value.
field [X] must be an array | A concurrency field exists but is not an array. | Send an array value.
item at index N in [X] must be an object or primitive value | A concurrency item is a nested array or unsupported type. | Use object or primitive items.
field [X] must not mix object and primitive items | One array mixes object items and primitive items. | Use a consistent item type.
concurrency_num (N) exceeds limit (M) | Generated task count exceeds the platform limit. | Reduce input size or adjust platform limits.

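
If you want to catch some of these errors before submitting a run, a local pre-check along the following lines may help. This is only a sketch: check_concurrency_input and first_error are hypothetical helpers, the messages are copied from the table above, and the order in which the platform actually performs its checks is an assumption.

```python
def check_concurrency_input(fields: list, submitted) -> None:
    """Raise ValueError for input shapes that the table above lists as errors."""
    if not isinstance(submitted, dict):
        raise ValueError("custom parameters must contain a single JSON object")
    for name in fields:
        if name not in submitted:
            continue
        value = submitted[name]
        if not isinstance(value, list):
            raise ValueError(f"field [{name}] must be an array")
        shapes = set()
        for index, item in enumerate(value):
            if item is None:
                continue  # null is filtered as empty, not rejected
            if isinstance(item, dict):
                shapes.add("object")
            elif isinstance(item, (str, int, float, bool)):
                shapes.add("primitive")
            else:
                # Nested arrays and other unsupported types land here.
                raise ValueError(
                    f"item at index {index} in [{name}] must be an object or primitive value"
                )
        if len(shapes) > 1:
            raise ValueError(f"field [{name}] must not mix object and primitive items")


def first_error(fields: list, submitted):
    """Return the first error message, or None when the input passes."""
    try:
        check_concurrency_input(fields, submitted)
    except ValueError as exc:
        return str(exc)
    return None


print(first_error(["keywords"], {"keywords": [{"url": "a"}, "x"]}))
```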
  • Use concurrency.fields for task splitting.
  • Make every concurrency field match a properties[*].name with type: "array".
  • If you use remove_fields, keep it as a subset of fields.
  • Do not rely on remove_fields keys being present in task input; they can be removed entirely.
  • Do not mix object items and primitive items inside the same concurrency array.
  • Do not use nested arrays as concurrency items.
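
The schema-side items of this checklist can be verified with a small script before upload. The following Python sketch is an illustration under stated assumptions: check_concurrency_schema is a hypothetical helper, it takes the parsed schema as a dict, and the wording of the reported problems is made up for the example.

```python
def check_concurrency_schema(schema: dict) -> list:
    """Return a list of checklist violations found in the schema."""
    problems = []
    concurrency = schema.get("concurrency") or {}
    fields = concurrency.get("fields") or []
    array_names = {
        prop.get("name")
        for prop in schema.get("properties", [])
        if prop.get("type") == "array"
    }
    for name in fields:
        # Every concurrency field must match an array property.
        if name not in array_names:
            problems.append(f"{name}: no matching array property")
    for name in concurrency.get("remove_fields") or []:
        # remove_fields should stay a subset of fields.
        if name not in fields:
            problems.append(f"{name}: listed in remove_fields but not in fields")
    return problems


example_schema = {
    "concurrency": {"fields": ["keywords", "limit"], "remove_fields": ["place_ids"]},
    "properties": [
        {"name": "keywords", "type": "array"},
        {"name": "limit", "type": "integer"},
    ],
}
print(check_concurrency_schema(example_schema))
```
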

Each input item must be an object. In normal Worker schemas, each property should contain these fields:

Field | Type | Required | Description
title | string | Yes | Field label shown in the form.
name | string | Yes | Internal field name used by Worker code. It must be unique and should match ^[A-Za-z_][A-Za-z0-9_]*$.
type | string | Yes | Data type. See the type table below.
editor | string | Yes | Form control used in the web interface. See the editor table below.
description | string | Yes | Helper text shown below the field.
required | boolean | Yes | If true, the Worker cannot start until the user fills in this field.
default | Same as type | No | Initial value shown in the form. It should match type.
options | array | No | Option list for checkbox, select, or radio.

Type | Meaning | Typical default | Common editors
string | Text | "abc" | input, textarea, select
integer | Integer | 42 | number, input
number | Floating-point number | 3.14 | number
boolean | Boolean | true / false | switch
array | List | [] / [...] | checkbox, stringList, requestList
object | Object | {} | Rarely used directly

Editor | Recommended type | Use case
input | string, integer, number | Single-line text or simple numeric input
textarea | string | Multi-line text
number | integer, number | Numeric input
switch | boolean | On/off setting
checkbox | array | Multiple choices
select | string, integer | Single-select dropdown
radio | string, integer | Single-choice radio group
stringList | array | List of strings
requestList | array | List of URL or request objects
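
The three tables above translate fairly directly into a lint step for properties. The Python sketch below is one possible way to do it, not an official validator: check_properties and the problem messages are hypothetical, and editors that are not in the editor table are simply skipped.

```python
import re

NAME_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
REQUIRED_KEYS = ("title", "name", "type", "editor", "description", "required")
# Recommended types per editor, copied from the editor table above.
EDITOR_TYPES = {
    "input": {"string", "integer", "number"},
    "textarea": {"string"},
    "number": {"integer", "number"},
    "switch": {"boolean"},
    "checkbox": {"array"},
    "select": {"string", "integer"},
    "radio": {"string", "integer"},
    "stringList": {"array"},
    "requestList": {"array"},
}


def check_properties(properties: list) -> list:
    """Return human-readable problems found in a properties array."""
    problems = []
    seen = set()
    for index, prop in enumerate(properties):
        label = prop.get("name", f"#{index}")
        missing = [key for key in REQUIRED_KEYS if key not in prop]
        if missing:
            problems.append(f"{label}: missing {', '.join(missing)}")
        name = prop.get("name")
        if isinstance(name, str):
            if not NAME_PATTERN.fullmatch(name):
                problems.append(f"{label}: invalid name")
            if name in seen:
                problems.append(f"{label}: duplicate name")
            seen.add(name)
        allowed = EDITOR_TYPES.get(prop.get("editor"))
        if allowed is not None and prop.get("type") not in allowed:
            problems.append(f"{label}: editor does not fit type")
    return problems


print(check_properties([
    {"title": "Limit", "name": "1limit", "type": "string", "editor": "switch",
     "description": "Example of a badly configured field", "required": False},
]))
```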

You can choose different editor types based on your needs to improve the user experience.

Editor Value | UI Form | Common Use Cases
input | Single-line input | Short text, keywords, account names
textarea | Multi-line textbox | Notes, long text descriptions
number | Number input | Limits, page numbers, wait seconds

Editor Value | UI Form | Common Use Cases
select | Dropdown | Gender, language, region
radio | Radio group | One-of-two or one-of-three choices
checkbox | Checkbox set | Select multiple tags of interest
switch | Toggle switch | Enable or disable an option

Editor Value | UI Form | Common Use Cases
datepicker | Date picker | Filter by a specific publish date
requestList | URL list | Batch input for page links to scrape, with Excel-style import support
requestListSource | URL request source | Allows additional custom parameters
stringList | String list | Batch input for multiple keywords

Single-line input (input):

{
  "title": "Location (use only one location per run)",
  "name": "location",
  "type": "string",
  "editor": "input",
  "default": "New York, USA"
}

Multi-line textbox (textarea):

{
  "title": "Filter reviews by keywords",
  "name": "keywords",
  "type": "string",
  "editor": "textarea"
}

Number input (number):

{
  "title": "Number of places to extract (per each search term or URL)",
  "name": "maxPlacesPerSearch",
  "type": "integer",
  "editor": "number",
  "default": 4
}

Dropdown (select). To allow multiple selections, set the multiple: true attribute on the select property.

{
  "title": "Language",
  "name": "language",
  "type": "string",
  "editor": "select",
  "options": [
    {
      "label": "English",
      "value": "en"
    },
    {
      "label": "Chinese",
      "value": "zh"
    }
  ],
  "default": "en"
}

Radio group (radio):

{
  "title": "Category",
  "name": "radio",
  "type": "integer",
  "editor": "radio",
  "options": [
    {
      "label": "hotel",
      "value": 1
    },
    {
      "label": "restaurant",
      "value": 2
    }
  ],
  "default": 1
}

Checkbox set (checkbox):

{
  "title": "Data Sections to Scrape",
  "name": "data_sections",
  "type": "array",
  "editor": "checkbox",
  "options": [
    {
      "label": "Reviews",
      "value": "reviews"
    },
    {
      "label": "Address",
      "value": "address"
    },
    {
      "label": "Phone Number",
      "value": "phone_number"
    }
  ],
  "default": ["reviews", "address"]
}

Date picker (datepicker):

{
  "title": "Extract posts that are newer than",
  "name": "date",
  "type": "string",
  "editor": "datepicker",
  "format": "DD/MM/YYYY",
  "valueFormat": "DD/MM/YYYY"
}

Toggle switch (switch):

{
  "title": "Skip closed places",
  "name": "skipClosed",
  "type": "boolean",
  "editor": "switch"
}
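
On the Worker side, a datepicker value still has to be parsed before it can be compared with post dates. The following Python sketch assumes that the value arrives as a string formatted according to valueFormat, here DD/MM/YYYY as in the datepicker example above; the text does not state this explicitly, and parse_picker_date is a hypothetical helper.

```python
from datetime import date, datetime


def parse_picker_date(value: str) -> date:
    """Parse a DD/MM/YYYY string into a date; raises ValueError if malformed."""
    # "%d/%m/%Y" is the strptime equivalent of DD/MM/YYYY.
    return datetime.strptime(value, "%d/%m/%Y").date()


cutoff = parse_picker_date("25/12/2024")
print(cutoff.isoformat())  # 2024-12-25
```

If you change valueFormat in the schema, the strptime pattern in the Worker needs to change with it.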

URL list (requestList), as an object array with custom key names:

{
  "name": "startURLs",
  "type": "array",
  "title": "Start URLs",
  "editor": "requestList",
  "default": [
    {
      "key": "value1"
    },
    {
      "key": "value2"
    }
  ],
  "required": true,
  "description": "The URLs of the website to scrape"
}

Or as a plain string array:

{
  "name": "startURLs",
  "type": "array",
  "title": "Start URLs",
  "editor": "requestList",
  "default": [
    "value1",
    "value2"
  ],
  "required": true,
  "description": "The URLs of the website to scrape"
}

10. URL Request Source (requestListSource)

Similar to requestList, but allows you to define additional custom parameters for each URL entry via param_list.

{
  "name": "url",
  "type": "array",
  "title": "startURLs",
  "editor": "requestListSource",
  "default": [
    {
      "url": "https://www.instagram.com/espn",
      "num_of_posts": "10"
    }
  ],
  "param_list": [
    {
      "param": "url",
      "title": "URL",
      "required": true,
      "description": "The URL to scrape"
    },
    {
      "param": "num_of_posts",
      "title": "Maximum Posts",
      "description": "Maximum number of posts to fetch"
    }
  ],
  "description": "The URLs of the website to scrape"
}
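
Worker code that consumes requestListSource entries may want to check them against param_list first. The Python sketch below shows one way to do that; normalize_entries is a hypothetical helper, and the assumption that parameter values arrive as strings (as in the default above, where num_of_posts is "10") is not confirmed by the text.

```python
def normalize_entries(entries: list, param_list: list) -> list:
    """Keep only declared params and reject entries missing a required one."""
    known = [p["param"] for p in param_list]
    required = [p["param"] for p in param_list if p.get("required")]
    normalized = []
    for index, entry in enumerate(entries):
        missing = [name for name in required if not entry.get(name)]
        if missing:
            raise ValueError(f"entry {index} is missing {', '.join(missing)}")
        normalized.append({name: entry[name] for name in known if name in entry})
    return normalized


param_list = [
    {"param": "url", "title": "URL", "required": True},
    {"param": "num_of_posts", "title": "Maximum Posts"},
]
entries = [{"url": "https://www.instagram.com/espn", "num_of_posts": "10"}]

for entry in normalize_entries(entries, param_list):
    # Convert the string value before using it as a number.
    limit = int(entry.get("num_of_posts", "0"))
    print(entry["url"], limit)
```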

String list (stringList), as an object array with custom key names:

{
  "title": "Search term(s)",
  "name": "searchTerms",
  "type": "array",
  "editor": "stringList",
  "default": [
    {
      "key": "value1"
    },
    {
      "key": "value2"
    }
  ]
}

Or as a plain string array:

{
  "title": "Search term(s)",
  "name": "searchTerms",
  "type": "array",
  "editor": "stringList",
  "default": [
    "value1",
    "value2"
  ]
}

Developers can logically group multiple configuration items by using specific fields. When there are many configuration items, grouping improves readability and maintainability, helping users locate and understand the settings more easily.

Parameter | Example Value | Type | Required | Description
sectionCaption | - | String | No | Defines the display title of a group. When this property appears in a configuration item, it is treated as the start of a new group.
sectionDescription | - | String | No | Adds extra explanation for the current group and provides more detailed context.

{
  "description": "Find usernames across 400+ social networks. Check if a username is available or already taken on various platforms.",
  "concurrency": {
    "fields": ["username"]
  },
  "properties": [
    {
      "title": "Username",
      "name": "username",
      "type": "array",
      "editor": "stringList",
      "description": "Username(s) to search. One per line.",
      "default": [
        "john_doe"
      ]
    },
    {
      "title": "Timeout (secs)",
      "name": "timeout",
      "type": "integer",
      "editor": "number",
      "description": "Request timeout in seconds per site.",
      "default": 30,
      "sectionCaption": "Request control and result settings",
      "sectionDescription": "Configure the performance parameters of the crawler requests."
    }
  ]
}

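
To preview how a schema will be grouped, the properties array can be walked in order. The Python sketch below rests on two assumptions that the text does not spell out: properties that follow a sectionCaption belong to that group until the next sectionCaption appears, and properties before the first sectionCaption form an untitled group. group_properties is a hypothetical helper.

```python
def group_properties(properties: list) -> list:
    """Split properties into groups, starting a new one at each sectionCaption."""
    groups = []
    for prop in properties:
        if "sectionCaption" in prop or not groups:
            groups.append({
                "caption": prop.get("sectionCaption"),
                "description": prop.get("sectionDescription"),
                "fields": [],
            })
        groups[-1]["fields"].append(prop["name"])
    return groups


properties = [
    {"name": "username"},
    {
        "name": "timeout",
        "sectionCaption": "Request control and result settings",
        "sectionDescription": "Configure the performance parameters of the crawler requests.",
    },
]
for group in group_properties(properties):
    print(group["caption"], group["fields"])
```
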
  1. Write clear descriptions: Make sure description is clear and accurate. This helps your script get discovered by more target users.
  2. Set sensible defaults: A reasonable default value lets users run the script immediately and greatly lowers the barrier to entry.
  3. Validate required fields: For parameters without which the script cannot run, such as login cookies or the main URL, be sure to set "required": true.
  4. Configure task splitting deliberately: Use concurrency.fields for array inputs that should split into tasks. Use remove_fields only when one input mode should disable another.
  5. Max results naming: If your Worker accepts a maximum-results parameter, use the field name max_results. This is the conventional name recognized by the platform and downstream integrations.