# Go Script Demo

## GitHub Repository

https://github.com/core-claw/GoScirptDemo

## Required Files (Located in the Project Root Directory)

```
├── main.go
├── main
├── input_schema.json
├── README.md
├── go.mod
├── go.sum
└── GoSdk
    ├── sdk.go
    ├── sdk_pd.go
    └── sdk_grpc_pd.go
```

| File Name | Description |
|---|---|
| main.go | Script source code |
| main | Compiled executable (execution entry point); must be named `main` |
| input_schema.json | UI input form configuration |
| README.md | Project documentation |
| sdk.go | Core SDK functionality (located in the GoSdk directory) |
| sdk_pd.go | Data processing enhancement module (located in the GoSdk directory) |
| sdk_grpc_pd.go | Network communication module (located in the GoSdk directory) |
## Go Scripts Must Be Built into an Executable Before Uploading

```cmd
set CGO_ENABLED=0
set GOOS=linux
set GOARCH=amd64
go build -o main ./main.go
```

It is recommended to compress the executable with UPX after building to significantly reduce its file size.
## Core SDK Files

### File Overview
The following three SDK files are required and must be placed in the GoSdk directory of the script project:
| File Name | Main Function |
|---|---|
| sdk.go | Core SDK functionality |
| sdk_pd.go | Data processing enhancement module |
| sdk_grpc_pd.go | Network communication module |
These files together form the script’s toolbox, providing all essential capabilities required for crawler execution and communication with the platform backend.
## Core Feature Usage Guide

### 1. Environment Parameters – Retrieve Script Input Configuration

When the script starts, configuration parameters (such as target website URLs or search keywords) can be passed in externally.
Use the following method to retrieve them:
```go
// Retrieve all input parameters as a JSON string
ctx := context.Background()
inputJSON, _ := coresdk.Parameter.GetInputJSONString(ctx)

// Example return value:
// {"website": "example.com", "keyword": "technology news"}
```

Use case: you can reuse the same script to crawl different websites or datasets simply by changing the input parameters, without modifying the code.
### 2. Runtime Logs – Record Script Execution Process

During execution, logs of different levels can be recorded and displayed in the platform console for monitoring and debugging:

```go
ctx := context.Background()

// Debug information (most detailed, for troubleshooting)
coresdk.Log.Debug(ctx, "Connecting to target website...")

// Informational logs (normal workflow)
coresdk.Log.Info(ctx, "Successfully retrieved 10 news items")

// Warning logs (non-critical issues)
coresdk.Log.Warn(ctx, "Network latency is high, performance may be affected")

// Error logs (execution failures)
coresdk.Log.Error(ctx, "Failed to access target website, please check network connection")
```

Log Levels:
- debug: Detailed debugging information (recommended during development)
- info: Normal execution flow
- warn: Warnings that do not stop execution
- error: Critical errors that require attention
### 3. Result Submission – Sending Data Back to the Platform

After data collection, results must be returned to the platform in two steps.

#### Step 1: Define Table Headers (Required)

Before pushing data, define the table structure (similar to defining column headers in Excel):

```go
headers := []*coresdk.TableHeaderItem{
	{
		Label:  "Title",
		Key:    "title",
		Format: "text",
	},
	{
		Label:  "Content",
		Key:    "content",
		Format: "text",
	},
}

ctx := context.Background()
res, err := coresdk.Result.SetTableHeader(ctx, headers)
```

Field Description:
- Label: Column name displayed to users
- Key: Unique identifier used in code
- Format: Data type, supported values:
  - "text" – string
  - "integer" – integer
  - "boolean" – true/false
  - "array" – list
  - "object" – dictionary/object
#### Step 2: Push Data Row by Row

After defining headers, push the collected data one record at a time:

```go
type result struct {
	Title   string `json:"title"`
	Content string `json:"content"`
}

resultData := []result{
	{Title: "Sample Title 1", Content: "Sample Content 1"},
	{Title: "Sample Title 2", Content: "Sample Content 2"},
}

ctx := context.Background()

for _, datum := range resultData {
	jsonBytes, _ := json.Marshal(datum)

	res, err := coresdk.Result.PushData(ctx, string(jsonBytes))
	if err != nil {
		coresdk.Log.Error(ctx, fmt.Sprintf("Failed to push data: %v", err))
		return
	}
	fmt.Printf("PushData Response: %+v\n", res)
}
```

Important Notes:
- The order of setting headers and pushing data does not matter
- Keys in pushed data must exactly match the keys defined in headers
- Data must be pushed one record at a time
- Logging after each push is recommended for traceability
## Common Issues & Notes

- File placement: Ensure all SDK files are located in the script project's GoSdk directory
- Imports: Use the SDK (imported as `coresdk` in this demo) directly to access SDK functionality
- Key consistency: Data keys must exactly match table header keys
- Error handling: Always check return values, especially when pushing data
## Script Entry File (main.go)

### Complete Code Example

```go
package main

import (
	"context"
	"crypto/tls"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
	"time"

	coresdk "test/GoSdk"
)

func run() {
	ctx := context.Background()

	time.Sleep(2 * time.Second)
	coresdk.Log.Info(ctx, "golang gRPC SDK client started......")

	// 1. Get input parameters
	inputJSON, err := coresdk.Parameter.GetInputJSONString(ctx)
	if err != nil {
		coresdk.Log.Error(ctx, fmt.Sprintf("Failed to get input parameters: %v", err))
		return
	}
	coresdk.Log.Debug(ctx, fmt.Sprintf("Input parameters: %s", inputJSON))

	// 2. Get proxy configuration (read from environment variables for flexible deployment)
	proxyDomain := os.Getenv("PROXY_DOMAIN")
	if proxyDomain == "" {
		proxyDomain = "proxy-inner.coreclaw.com:6000"
	}
	coresdk.Log.Info(ctx, fmt.Sprintf("Proxy domain: %s", proxyDomain))

	proxyAuth := os.Getenv("PROXY_AUTH")
	coresdk.Log.Info(ctx, fmt.Sprintf("Proxy authentication: %s", proxyAuth))

	// 3. Build proxy URL
	var proxyURL string
	if proxyAuth != "" {
		proxyURL = fmt.Sprintf("socks5://%s@%s", proxyAuth, proxyDomain)
	}
	coresdk.Log.Info(ctx, fmt.Sprintf("Proxy URL: %s", proxyURL))

	// 4. Business logic processing (example)
	coresdk.Log.Info(ctx, "Starting business logic processing")

	// Create custom HTTP client with proxy support
	httpClient := &http.Client{
		Timeout: time.Second * 30,
	}

	// If a proxy is configured, set the proxy transport
	if proxyURL != "" {
		proxyParsed, err := url.Parse(proxyURL)
		if err != nil {
			coresdk.Log.Error(ctx, fmt.Sprintf("Failed to parse proxy URL: %v", err))
			return
		}

		httpClient.Transport = &http.Transport{
			Proxy: http.ProxyURL(proxyParsed),
			TLSClientConfig: &tls.Config{
				InsecureSkipVerify: true,
			},
		}

		coresdk.Log.Info(ctx, "Proxy client configured")
	}

	// Send request to ipinfo.io
	targetURL := "https://ipinfo.io/ip"
	req, err := http.NewRequestWithContext(ctx, "GET", targetURL, nil)
	if err != nil {
		coresdk.Log.Error(ctx, fmt.Sprintf("Failed to create request: %v", err))
		return
	}

	coresdk.Log.Info(ctx, fmt.Sprintf("Requesting: %s", targetURL))

	resp, err := httpClient.Do(req)
	if err != nil {
		coresdk.Log.Error(ctx, fmt.Sprintf("Request failed: %v", err))
		return
	}
	defer resp.Body.Close()

	coresdk.Log.Info(ctx, fmt.Sprintf("Response status code: %d", resp.StatusCode))

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		coresdk.Log.Error(ctx, fmt.Sprintf("Failed to read response: %v", err))
		return
	}

	ip := strings.TrimSpace(string(body))
	coresdk.Log.Info(ctx, fmt.Sprintf("Current IP address: %s", ip))
	coresdk.Log.Info(ctx, "Business logic processing completed")

	// 5. Push result data
	type result struct {
		Title   string `json:"title"`
		Content string `json:"content"`
	}

	resultData := []result{
		{Title: "Sample Title 1", Content: "Sample Content 1"},
		{Title: "Sample Title 2", Content: "Sample Content 2"},
	}

	for _, datum := range resultData {
		jsonBytes, _ := json.Marshal(datum)

		res, err := coresdk.Result.PushData(ctx, string(jsonBytes))
		if err != nil {
			coresdk.Log.Error(ctx, fmt.Sprintf("Failed to push data: %v", err))
			return
		}
		fmt.Printf("PushData Response: %+v\n", res)
	}

	// 6. Set table header
	headers := []*coresdk.TableHeaderItem{
		{
			Label:  "Title",
			Key:    "title",
			Format: "text",
		},
		{
			Label:  "Content",
			Key:    "content",
			Format: "text",
		},
	}

	res, err := coresdk.Result.SetTableHeader(ctx, headers)
	if err != nil {
		coresdk.Log.Error(ctx, fmt.Sprintf("Failed to set table header: %v", err))
		return
	}
	fmt.Printf("SetTableHeader Response: %+v\n", res)

	coresdk.Log.Info(ctx, "Script execution completed")
}

func main() {
	run()
}
```

## Automated Data Collection Script: Workflow & Principles
### 1. Script Overview

This automation script works like a digital employee: it automatically opens target web pages (such as social media sites), extracts the required information, and organizes the data into structured tables.
### 2. How Does It Work?

The entire process can be simplified into four main stages:

#### Step 1: Receive Instructions (Input Parameters)

Before execution, you provide instructions such as:
- Target page URL
- Number of records to collect
#### Step 2: Stealth Preparation (Proxy Network Configuration)

To reliably access overseas or restricted websites, the script automatically configures a secure proxy channel.
Proxy Environment Variables:
| Environment Variable | Description | Example Value |
|---|---|---|
| PROXY_DOMAIN | Proxy server address | proxy-inner.coreclaw.com:6000 |
| PROXY_AUTH | Proxy authentication info | user:password |
#### Step 3: Automated Execution (Business Logic Processing)

This is the core stage, where the script:
- Visits target pages
- Extracts titles, content, images, and other required data
#### Step 4: Result Reporting (Data Push & Table Generation)

After collection:
- Raw data is converted into standardized formats
- Results are saved to the system
- Table headers (e.g., “URL”, “Content”) are automatically generated