The Proxyman Playbook for Seiue Script Resilience
A strategic guide to capturing, understanding, and rebuilding the Seiue attendance script by analyzing the underlying API contract. This comprehensive version includes an advanced guide for long-term maintenance and proactive adaptation.
0) Foundational Concepts: The “Why” Behind the Method
The web browser is just one client of Seiue’s API. Our script is another. The core principle is to make our script behave exactly like the browser, sending identical requests to the server’s API.
- Client-Side vs. Server-Side Logic: The web UI has limitations (e.g., disabling buttons). This is client-side validation, a “guardrail” in JavaScript for a better user experience. Our script bypasses this. We are only bound by server-side validation—the absolute rules enforced by the back-end API. This is the “fortress wall” we must respect.
- The Goal of This Playbook: To reverse-engineer the “secret handshake” (the API calls) between the browser and the server, so we can replicate it and adapt when it changes.
A) Prereqs & Setup
- Proxyman installed, with Helper Tool and SSL Certificate trusted.
- Enable macOS Proxy in Proxyman during capture.
- Use a clean browser profile (e.g., Brave in Incognito).
- Disable QUIC protocol in your browser (e.g., brave://flags → “Enable QUIC” → Disabled).
- Filter traffic to only these essential domains: passport.seiue.com, api.seiue.com, chalk-c3.seiue.com.
B) Script ↔ Capture Mapping & Analysis
This section maps core script functions and parameters to the API calls you must capture.
1) Authentication: The Digital Keycard
- Concept: To prove who we are. The server issues a short-lived access_token after a successful login. We must present this “keycard” in the headers of all subsequent requests.
- Code Touchpoints: login_and_get_token(), self.session.headers.update(...).
- Capture Points & Script Parameters:
  - Login Request: POST to self.login_url with form data {"email": ..., "password": ...}.
  - Token Request: POST to self.authorize_url with form data {'client_id': ..., 'response_type': 'token'}.
    - Response JSON: You confirmed this is correct. It contains access_token and active_reflection_id.
  - Session Headers: Authorization: Bearer ..., x-school-id, x-role, x-reflection-id.
- Recovery (401 Unauthorized errors): Re-capture the login flow. See Appendix E.1 for advanced handling.
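A minimal sketch of how the token response could be turned into the session headers listed above. The helper name build_session_headers and the school_id and role arguments are assumptions for illustration; only the header names and the access_token and active_reflection_id fields come from the capture notes.

```python
from typing import Any, Dict


def build_session_headers(token_json: Dict[str, Any], school_id: str, role: str) -> Dict[str, str]:
    """Build the headers every later request carries (hypothetical helper)."""
    # Fail loudly if the authorize response changed shape, so the problem
    # is caught here and not as a confusing 401 several calls later.
    missing = [k for k in ("access_token", "active_reflection_id") if k not in token_json]
    if missing:
        raise KeyError(f"token response is missing {missing}; re-capture the authorize call")
    return {
        "Authorization": f"Bearer {token_json['access_token']}",
        "x-school-id": str(school_id),
        "x-role": role,
        "x-reflection-id": str(token_json["active_reflection_id"]),
    }
```

The result can be passed straight to self.session.headers.update(...). Raising on a missing field is a deliberate choice: a renamed field in the token response is exactly the kind of drift this playbook is meant to surface early.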
2) Schedule & Verification: The Source of Truth
- Concept: To ask the server “What work needs to be done?” and “What work is already complete?”.
- Code Touchpoints: get_scheduled_lessons(), get_checked_attendance_time_ids().
- Capture Points & Script Parameters:
  - Schedule: GET to self.events_url_template with params start_time, end_time, expand.
  - Verification: GET to self.verification_url with params attendance_time_id_in, biz_id_in, biz_type_in, expand, paginated=0.
- Recovery: Re-capture these GET requests and update JSON paths or query parameters in the code. See Appendix E.5 regarding pagination.
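As a rough illustration, the verification query could be assembled in one place so that a re-capture only requires editing a single function. This sketch assumes the *_in parameters take comma-separated lists, which should be confirmed against the captured request; build_verification_params is a hypothetical helper, and the biz_type and expand values are left to the caller because the notes do not state them.

```python
from typing import Dict, Iterable


def build_verification_params(
    attendance_time_ids: Iterable[int],
    biz_ids: Iterable[int],
    biz_type: str,
    expand: str,
) -> Dict[str, str]:
    """Assemble the verification query parameters (hypothetical helper)."""

    def join(values: Iterable[int]) -> str:
        # Assumption: "_in" filters are comma-separated lists.
        return ",".join(str(v) for v in values)

    return {
        "attendance_time_id_in": join(attendance_time_ids),
        "biz_id_in": join(biz_ids),
        "biz_type_in": biz_type,
        "expand": expand,
        "paginated": "0",  # ask for the full list in one response
    }
```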
3) Student Roster: The Payload’s Building Blocks
- Concept: The submission payload requires a unique ID for each student (owner_id). This call fetches that mapping.
- Code Touchpoints: Inside submit_attendance_for_lesson_group().
- Capture Point & Script Parameters: GET to self.students_url_template with params expand=reflection, member_type=student.
- Recovery: Re-capture and update the JSON path to reflection.id. See Appendix E.4 for data type quirks.
4) Submit Attendance: The Critical Action
- Concept: To send a command to the server: “For these lessons, mark these students as present.” This is a PUT/POST request with strict server-side validation.
- Code Touchpoints: submit_attendance_for_lesson_group() payload construction.
- Capture Point & Script Parameters: PUT to self.attendance_submit_url_template with a JSON payload containing abnormal_notice_roles and attendance_records (a list of {tag, attendance_time_id, owner_id, source}).
- Recovery (422 Unprocessable Entity errors): This is the most critical capture. Perform a single submission in the UI and replicate the entire JSON body structure 1:1. See Appendix E.4 for common payload drift scenarios.
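The payload shape described above might be built roughly as follows. This is a sketch, not the script's actual code: build_attendance_payload is a hypothetical name, and the tag and source values must be copied from a real capture, so they are required arguments with no defaults.

```python
from typing import Any, Dict, Iterable, List, Tuple


def build_attendance_payload(
    pairs: Iterable[Tuple[int, int]],
    *,
    tag: str,
    source: str,
    abnormal_notice_roles: Iterable[str] = (),
) -> Dict[str, Any]:
    """Build the submission body from (attendance_time_id, owner_id) pairs."""
    seen_records = set()
    records: List[Dict[str, Any]] = []
    for attendance_time_id, owner_id in pairs:
        key = (attendance_time_id, owner_id)
        if key in seen_records:
            continue  # skip accidental duplicates before they reach the server
        seen_records.add(key)
        records.append(
            {
                "tag": tag,
                "attendance_time_id": attendance_time_id,
                "owner_id": owner_id,
                "source": source,
            }
        )
    return {
        "abnormal_notice_roles": list(abnormal_notice_roles),
        "attendance_records": records,
    }
```

Keeping the whole body in one function means a 422 after an API change is fixed by editing a single dictionary literal to match the new capture.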
C) Understanding Server-Side Error Codes (The Server’s Clues)
- 401 Unauthorized: Your access token is expired or invalid. The _re_auth function should handle this by re-logging in.
- 403 Forbidden: Your token is valid, but your account lacks permission. This is a server rule you cannot bypass.
- 422 Unprocessable Entity: Your request was understood, but the payload content is invalid. The server’s validation rules rejected your data shape. Log the server’s error message; it often names the exact field that failed.
- 409 Conflict: The action conflicts with the resource’s current state. This often signals a business logic rule was violated, such as “Submission window closed”.
- 429 Too Many Requests: You are being rate-limited. The script’s urllib3.Retry handles this, but also check for a Retry-After header.
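The status codes above can be mapped to a next step in code, and the Retry-After header can be parsed by hand when the script needs the value itself. The sketch below is an illustration only: next_action, retry_after_seconds, and the action labels are hypothetical names, not part of the existing script.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

# Hypothetical action labels, one per status code discussed above.
ACTIONS = {
    401: "reauthenticate",
    403: "stop_permission_denied",
    409: "stop_business_rule",
    422: "fix_payload",
    429: "wait_and_retry",
}


def next_action(status_code: int) -> str:
    """Return the suggested next step for a status code."""
    return ACTIONS.get(status_code, "inspect_manually")


def retry_after_seconds(headers: Mapping[str, str], now: Optional[datetime] = None) -> Optional[float]:
    """Parse Retry-After, which may be a number of seconds or an HTTP date."""
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: let the caller pick a default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```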
D) The Disaster Recovery Flow (One-Click Rebuild)
Execute these steps in order when the script suddenly stops working.
- Prepare Sandbox: Enable Proxyman, use Incognito, disable QUIC, set domain filters. Clear the Proxyman list.
- Capture Auth: Log in from scratch. Capture POST /login and POST /authorize.
- Capture Schedule: Open today’s calendar. Capture GET /events.
- Capture Verification: Open the attendance overview page. Capture GET /attendances-info.
- Capture Roster: Open a class roster. Capture GET /group-members.
- Capture Submission: Submit attendance for one student. Capture the PUT/POST request. This is your ground truth.
- Patch the Code: Make minimal, targeted edits to your script based on the captured ground truth.
- Test Incrementally: Test the patched script on a single, non-critical future lesson first.
E) Appendix: Advanced Maintenance & Drift Guide
This section contains proactive strategies for maintaining the script’s long-term health.
1. Advanced Token Lifecycle (Refresh vs. Re-login)
- Refresh Tokens: If you ever capture a refresh_token in the /authorize response, it means the API supports a more efficient re-authentication flow. Look for an endpoint (often /token) that accepts grant_type=refresh_token. Wiring this into _with_refresh() is preferable to a full re-login, as it’s faster and less resource-intensive.
- Granular Error Logging: When a 401 occurs, log the server’s JSON error body (masking secrets). Some APIs return specific error codes like token_expired vs. invalid_token, which can help you decide whether to refresh or perform a full re-login.
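The refresh-or-re-login decision could be sketched as below. Everything here is an assumption until a real 401 body is captured: the field names error and code, and the helper name choose_reauth, are hypothetical.

```python
from typing import Any


def choose_reauth(error_body: Any, has_refresh_token: bool) -> str:
    """Return "refresh" or "relogin" for a 401 response (hypothetical helper)."""
    code = ""
    if isinstance(error_body, dict):
        # Assumption: the error code lives under "error" or "code".
        code = str(error_body.get("error") or error_body.get("code") or "")
    # Only an expired token is worth refreshing; anything else
    # (invalid_token, unknown codes, non-JSON bodies) gets a full re-login.
    if code == "token_expired" and has_refresh_token:
        return "refresh"
    return "relogin"
```

Defaulting to a full re-login keeps the behavior identical to the current flow whenever the server's answer is ambiguous.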
2. Headers, Cookies, and Casing
- Casing: While HTTP headers are case-insensitive by spec, some server middleware is not. Always mirror the browser’s exact casing (e.g., X-School-Id) to avoid unexpected 403 errors.
- New Headers: If your script starts failing with 403 while the browser succeeds, look for new headers on the browser’s request, especially for POST/PUT calls. A new anti-CSRF header (like X-CSRF-Token) is a common culprit.
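One way to spot both problems at once is to compare the header names from a captured browser request with the ones the script sends. The helper below is a hypothetical sketch that works on plain dictionaries, for example headers copied out of Proxyman by hand.

```python
from typing import Dict, List, Mapping, Tuple


def diff_headers(browser: Mapping[str, str], script: Mapping[str, str]) -> Dict[str, list]:
    """Report headers the script lacks and headers whose casing differs."""
    browser_names = {name.lower(): name for name in browser}
    script_names = {name.lower(): name for name in script}
    missing: List[str] = sorted(
        original for lowered, original in browser_names.items() if lowered not in script_names
    )
    casing: List[Tuple[str, str]] = sorted(
        (script_names[lowered], original)
        for lowered, original in browser_names.items()
        if lowered in script_names and script_names[lowered] != original
    )
    return {"missing_in_script": missing, "casing_mismatch": casing}
```

Each casing_mismatch entry is a (script spelling, browser spelling) pair, so the fix is to rename the first to match the second.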
3. Verification and Ground Truth
- Authoritative Signals: Treat 409/422 error responses as authoritative. Log the full JSON error body, as it often contains a machine-readable code (window_closed) or message that explains the exact business rule you violated.
- The Ground Truth: The only way to be 100% sure a submission succeeded is to re-verify. After a submission, always make a final call to GET /sams/attendance/attendances-info and confirm your submitted attendance_time_id now appears in the checked_attendance_time_ids list.
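The re-verification step might look like the sketch below. It assumes the verification response is a JSON object with checked_attendance_time_ids at the top level; if the capture shows the list nested elsewhere, adjust the lookup. The helper name unconfirmed_ids is hypothetical.

```python
from typing import Any, Iterable, List, Mapping


def unconfirmed_ids(submitted_ids: Iterable[Any], verification_json: Mapping[str, Any]) -> List[int]:
    """Return submitted attendance_time_ids the server has not marked as checked."""
    # Cast both sides to int so "123" and 123 compare as equal.
    checked = {int(x) for x in verification_json.get("checked_attendance_time_ids", [])}
    submitted = {int(x) for x in submitted_ids}
    return sorted(submitted - checked)
```

An empty result means every submission was confirmed; anything else is a list of IDs to retry or investigate.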
4. Payload Drift and Data Integrity
- Payload Drift: Be prepared for the submission payload to change over time. Common drifts include:
  - tag: "正常" changing to status: "present" or a numeric enum like status: 1.
  - Optional fields (source, reason) becoming required.
  - If you get a 422, the server’s response body is your best friend: it will often name the exact field that failed validation.
- Roster Quirks: Your code correctly casts IDs to int. Always assume API data might have inconsistent types (e.g., an ID being a string sometimes and an integer others). Your seen_records set is a perfect defense against accidental duplicate records.
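A defensive version of the ID cast could be written as follows. This is a sketch under the assumption that each roster entry is a JSON object with the ID at reflection.id; coerce_id and owner_ids are hypothetical names.

```python
from typing import Any, Iterable, List, Mapping, Optional


def coerce_id(value: Any) -> Optional[int]:
    """Return value as an int, or None if it cannot be an ID."""
    if isinstance(value, bool):
        return None  # bool is a subclass of int, but never a valid ID
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        try:
            return int(value.strip())
        except ValueError:
            return None
    return None


def owner_ids(members: Iterable[Mapping[str, Any]]) -> List[int]:
    """Collect reflection.id from roster entries, skipping malformed ones."""
    ids: List[int] = []
    for member in members:
        reflection = member.get("reflection") or {}
        owner_id = coerce_id(reflection.get("id"))
        if owner_id is not None:
            ids.append(owner_id)
    return ids
```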
5. Network and Workflow Hygiene
- Pagination: You are currently using paginated=0. If this parameter is ever deprecated, the API will likely fall back to default pagination (e.g., 10-20 items per page). You’ll need to implement logic to loop through pages using page and per_page query parameters, checking the X-Pagination-* response headers until all data is fetched.
- Idempotency & Retries: If the API ever adds an Idempotency-Key header, start sending a unique UUID with each PUT/POST request to prevent accidental double submissions on network retries. Your current (attendance_time_id, owner_id) de-dupe serves a similar purpose. For 429 errors, if a Retry-After header is present, your script should ideally sleep for that specific duration.
- Timezone Hygiene: Continue generating start_time/end_time in Asia/Shanghai and format it exactly as captured. This prevents off-by-one-day bugs that can occur around midnight in different timezones.
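Two of these points lend themselves to a short sketch: a page loop and a day window computed in Shanghai time. Several details are assumptions to check against a capture: the header name X-Pagination-Page-Count, the fetch_page callable (which stands in for one GET and returns the items plus the response headers), and the timestamp format string. A fixed UTC+8 offset is used for Asia/Shanghai, which observes no daylight saving time.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, List, Mapping, Tuple

SHANGHAI = timezone(timedelta(hours=8))  # Asia/Shanghai is UTC+8 with no DST

FetchPage = Callable[[int, int], Tuple[List[Any], Mapping[str, str]]]


def fetch_all_pages(fetch_page: FetchPage, per_page: int = 50, max_pages: int = 1000) -> List[Any]:
    """Collect items page by page until the server signals the last page."""
    items: List[Any] = []
    page = 1
    while page <= max_pages:  # hard cap guards against an endless loop
        batch, headers = fetch_page(page, per_page)
        items.extend(batch)
        total_pages = headers.get("X-Pagination-Page-Count")  # assumed header name
        if total_pages is not None:
            if page >= int(total_pages):
                break
        elif len(batch) < per_page:
            break  # no header: a short page means we reached the end
        page += 1
    return items


def shanghai_day_window(now: datetime, fmt: str = "%Y-%m-%d %H:%M:%S") -> Tuple[str, str]:
    """Return (start_time, end_time) for the Shanghai calendar day containing now.

    now must be timezone-aware; fmt is an assumed format to replace with the captured one.
    """
    local = now.astimezone(SHANGHAI)
    start = local.replace(hour=0, minute=0, second=0, microsecond=0)
    end = local.replace(hour=23, minute=59, second=59, microsecond=0)
    return start.strftime(fmt), end.strftime(fmt)
```

Passing 17:30 UTC on 1 January yields the window for 2 January, because it is already past midnight in Shanghai; that is the off-by-one-day case the timezone note warns about.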
6. Proactive Recovery: Baseline Snapshots
- Create a “Golden Set”: After confirming the script works, use Proxyman to capture a clean sequence of the six key requests (login → authorize → events → attendances-info → roster → submit). Export this session as a HAR file or Proxyman session (.proxymanlog).
- Commit Privately: Store this “golden” HAR file in a private location (not your public git repo).
- Accelerate Debugging: When the script breaks in the future, perform a new capture and use a diff tool to compare the new HAR file against your golden baseline. The differences will reveal what changed in the API contract.
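A raw text diff of two HAR files is noisy because tokens, timestamps, and IDs change on every capture. A sketch of a more targeted comparison is shown below: it reduces each request to its method, path, query parameter names, and top-level JSON body keys, then compares those shapes. The function names are hypothetical; the log.entries, request.queryString, and request.postData.text fields are part of the standard HAR layout.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.parse import urlsplit

Shape = Dict[str, List[str]]


def request_shapes(har: Dict[str, Any]) -> Dict[Tuple[str, str], Shape]:
    """Map (method, path) to the query and body key names seen in a HAR."""
    shapes: Dict[Tuple[str, str], Shape] = {}
    for entry in har.get("log", {}).get("entries", []):
        request = entry["request"]
        path = urlsplit(request["url"]).path
        query_keys = sorted({q["name"] for q in request.get("queryString", [])})
        body_keys: List[str] = []
        text = (request.get("postData") or {}).get("text")
        if text:
            try:
                body = json.loads(text)
            except ValueError:
                body = None  # form-encoded or otherwise non-JSON body
            if isinstance(body, dict):
                body_keys = sorted(body)
        shapes[(request["method"], path)] = {"query": query_keys, "body": body_keys}
    return shapes


def diff_shapes(golden: Dict[Tuple[str, str], Shape], current: Dict[Tuple[str, str], Shape]) -> list:
    """List (key, golden_shape, current_shape) for every request that changed."""
    changes = []
    for key in sorted(set(golden) | set(current)):
        if golden.get(key) != current.get(key):
            changes.append((key, golden.get(key), current.get(key)))
    return changes
```

Load each file with json.load before calling request_shapes. Paths that embed IDs will show up as removed and added pairs, so it may help to normalize numeric path segments first.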
7. Security and Logging Best Practices
- Mask Secrets: Never log full access_token or refresh_token values. Log the first and last few characters if needed for identification (e.g., "Bearer sk_...w567").
- File Permissions: Ensure your apiall.log file permissions are not world-readable (e.g., chmod 600 apiall.log) to protect any sensitive data it might contain.
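Both practices can be applied from inside the script. The following is a minimal sketch with hypothetical helper names; the regular expression assumes the token consists of typical bearer-token characters, and the permission change is meaningful on POSIX systems only.

```python
import os
import re

# Matches "Bearer <token>" where the token uses common token characters.
_BEARER = re.compile(r"(Bearer\s+)([A-Za-z0-9._~+/=-]+)")


def mask_secret(value: str, keep: int = 4) -> str:
    """Keep the first and last few characters; hide short values entirely."""
    if len(value) <= keep * 2:
        return "***"
    return f"{value[:keep]}...{value[-keep:]}"


def mask_bearer(line: str) -> str:
    """Mask any bearer token that appears in a log line."""
    return _BEARER.sub(lambda m: m.group(1) + mask_secret(m.group(2)), line)


def restrict_log_file(path: str) -> None:
    """Create the log file if needed and make it readable by the owner only."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    os.close(fd)
    os.chmod(path, 0o600)  # also tightens a file that already existed
```

Calling restrict_log_file("apiall.log") once at startup, before the logger opens the file, avoids a window in which the file exists with looser permissions.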