The Proxyman Playbook for Seiue Script Resilience
A strategic guide to capturing, understanding, and rebuilding the Seiue attendance script by analyzing the underlying API contract. This comprehensive version includes an advanced guide for long-term maintenance and proactive adaptation.
0) Foundational Concepts: The “Why” Behind the Method
The web browser is just one client of Seiue’s API. Our script is another. The core principle is to make our script behave exactly like the browser, sending identical requests to the server’s API.
- Client-Side vs. Server-Side Logic: The web UI has limitations (e.g., disabling buttons). This is client-side validation, a “guardrail” in JavaScript for a better user experience. Our script bypasses this. We are only bound by server-side validation—the absolute rules enforced by the back-end API. This is the “fortress wall” we must respect.
- The Goal of This Playbook: To reverse-engineer the “secret handshake” (the API calls) between the browser and the server, so we can replicate it and adapt when it changes.
A) Prereqs & Setup
- Proxyman installed, with Helper Tool and SSL Certificate trusted.
- Enable macOS Proxy in Proxyman during capture.
- Use a clean browser profile (e.g., Brave in Incognito).
- Disable QUIC protocol in your browser (e.g., brave://flags → “Enable QUIC” → Disabled).
- Filter traffic to only these essential domains: passport.seiue.com, api.seiue.com, chalk-c3.seiue.com.
B) Script ↔ Capture Mapping & Analysis
This section maps core script functions and parameters to the API calls you must capture.
1) Authentication: The Digital Keycard
- Concept: To prove who we are. The server issues a short-lived access_token after a successful login. We must present this “keycard” in the headers of all subsequent requests.
- Code Touchpoints: login_and_get_token(), self.session.headers.update(...).
- Capture Points & Script Parameters:
  - Login Request: POST to self.login_url with form data {"email": ..., "password": ...}.
  - Token Request: POST to self.authorize_url with form data {'client_id': ..., 'response_type': 'token'}.
    - Response JSON: You confirmed this is correct. It contains access_token and active_reflection_id.
  - Session Headers: Authorization: Bearer ..., x-school-id, x-role, x-reflection-id.
- Recovery (401 Unauthorized errors): Re-capture the login flow. See Appendix E.1 for advanced handling.
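A minimal sketch of how the token response could be turned into the session headers listed above. The field names access_token and active_reflection_id come from the capture; the function name build_session_headers, its school_id and role arguments, and the example values are hypothetical and should be replaced with whatever your own capture shows.

```python
def build_session_headers(token_json: dict, school_id: str, role: str) -> dict:
    """Build the headers that every later request is expected to carry."""
    token = token_json["access_token"]
    reflection_id = token_json["active_reflection_id"]
    return {
        "Authorization": f"Bearer {token}",
        "x-school-id": str(school_id),
        "x-role": role,
        "x-reflection-id": str(reflection_id),
    }


# Made-up values for illustration only; in the script the result would be
# passed to self.session.headers.update(...).
headers = build_session_headers(
    {"access_token": "abc123", "active_reflection_id": 42},
    school_id="3",
    role="teacher",
)
```

A KeyError raised here is a useful early signal that the /authorize response shape has drifted.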
2) Schedule & Verification: The Source of Truth
- Concept: To ask the server “What work needs to be done?” and “What work is already complete?”.
- Code Touchpoints: get_scheduled_lessons(), get_checked_attendance_time_ids().
- Capture Points & Script Parameters:
  - Schedule: GET to self.events_url_template with params start_time, end_time, expand.
  - Verification: GET to self.verification_url with params attendance_time_id_in, biz_id_in, biz_type_in, expand, paginated=0.
- Recovery: Re-capture these GET requests and update JSON paths or query parameters in the code. See Appendix E.5 regarding pagination.
3) Student Roster: The Payload’s Building Blocks
- Concept: The submission payload requires a unique ID for each student (owner_id). This call fetches that mapping.
- Code Touchpoints: Inside submit_attendance_for_lesson_group().
- Capture Point & Script Parameters: GET to self.students_url_template with params expand=reflection, member_type=student.
- Recovery: Re-capture and update the JSON path to reflection.id. See Appendix E.4 for data type quirks.
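The roster-to-owner_id mapping can be sketched as below. The only detail taken from the capture is the reflection.id path; the assumption that the response is a list of member objects, and the helper name extract_owner_ids, are illustrative and must be checked against the real JSON.

```python
def extract_owner_ids(members: list) -> list:
    """Collect unique student IDs from reflection.id, normalised to int."""
    owner_ids = []
    seen = set()
    for member in members:
        # Tolerate members whose "reflection" is missing or null.
        reflection = member.get("reflection") or {}
        raw_id = reflection.get("id")
        if raw_id is None:
            continue
        owner_id = int(raw_id)  # IDs may arrive as "102" or 102
        if owner_id not in seen:
            seen.add(owner_id)
            owner_ids.append(owner_id)
    return owner_ids


# Hypothetical roster fragment with mixed ID types and one duplicate.
sample = [
    {"reflection": {"id": 101}},
    {"reflection": {"id": "102"}},
    {"reflection": None},
    {"reflection": {"id": "101"}},
]
owner_ids = extract_owner_ids(sample)
```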
4) Submit Attendance: The Critical Action
- Concept: To send a command to the server: “For these lessons, mark these students as present.” This is a PUT/POST request with strict server-side validation.
- Code Touchpoints: submit_attendance_for_lesson_group() payload construction.
- Capture Point & Script Parameters: PUT to self.attendance_submit_url_template with a JSON payload containing abnormal_notice_roles and attendance_records (a list of {tag, attendance_time_id, owner_id, source}).
- Recovery (422 Unprocessable Entity errors): This is the most critical capture. Perform a single submission in the UI and replicate the entire JSON body structure 1:1. See Appendix E.4 for common payload drift scenarios.
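As a rough illustration of the payload shape described above, the sketch below builds one record per (attendance_time_id, owner_id) pair and skips duplicates. The four record keys and abnormal_notice_roles come from the capture; the helper name, the empty list for abnormal_notice_roles, and the "web" default for source are assumptions, so copy the real values from your own captured request.

```python
def build_attendance_payload(attendance_time_ids, owner_ids,
                             tag="正常", source="web"):
    """Build the submission body, one record per unique (time, owner) pair."""
    records = []
    seen_records = set()
    for time_id in attendance_time_ids:
        for owner_id in owner_ids:
            key = (int(time_id), int(owner_id))
            if key in seen_records:
                continue  # avoid sending the same record twice
            seen_records.add(key)
            records.append({
                "tag": tag,
                "attendance_time_id": key[0],
                "owner_id": key[1],
                "source": source,  # assumed value; verify in Proxyman
            })
    return {
        "abnormal_notice_roles": [],  # assumed empty; verify in Proxyman
        "attendance_records": records,
    }


# The duplicated time ID (int and str form) yields only one set of records.
payload = build_attendance_payload([9001, "9001"], [101, 102])
```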
C) Understanding Server-Side Error Codes (The Server’s Clues)
- 401 Unauthorized: Your access token is expired or invalid. The _re_auth function should handle this by re-logging in.
- 403 Forbidden: Your token is valid, but your account lacks permission. This is a server rule you cannot bypass.
- 422 Unprocessable Entity: Your request was understood, but the payload content is invalid. The server’s validation rules rejected your data shape. Log the server’s error message; it often names the exact field that failed.
- 409 Conflict: The action conflicts with the resource’s current state. This often signals a business logic rule was violated, such as “Submission window closed”.
- 429 Too Many Requests: You are being rate-limited. The script’s urllib3.Retry handles this, but also check for a Retry-After header.
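If you ever need to honor Retry-After by hand (for example, outside the urllib3.Retry path), the header can carry either a number of seconds or an HTTP date. The helper below is a sketch under that assumption; the name retry_after_seconds and the one-second default are hypothetical.

```python
import email.utils
from datetime import datetime, timezone


def retry_after_seconds(headers, now=None, default=1.0):
    """Return how long to wait, based on a Retry-After header if present."""
    value = None
    for name, header_value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            value = header_value.strip()
            break
    if value is None:
        return default
    if value.isdigit():  # form 1: delay in seconds
        return float(value)
    try:  # form 2: an HTTP date
        when = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A caller would typically pass response.headers and then time.sleep() for the returned duration before retrying.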
D) The Disaster Recovery Flow (One-Click Rebuild)
Execute these steps in order when the script suddenly stops working.
- Prepare Sandbox: Enable Proxyman, use Incognito, disable QUIC, set domain filters. Clear the Proxyman list.
- Capture Auth: Log in from scratch. Capture POST /login and POST /authorize.
- Capture Schedule: Open today’s calendar. Capture GET /events.
- Capture Verification: Open the attendance overview page. Capture GET /attendances-info.
- Capture Roster: Open a class roster. Capture GET /group-members.
- Capture Submission: Submit attendance for one student. Capture the PUT/POST request. This is your ground truth.
- Patch the Code: Make minimal, targeted edits to your script based on the captured ground truth.
- Test Incrementally: Test the patched script on a single, non-critical future lesson first.
E) Appendix: Advanced Maintenance & Drift Guide
This section contains proactive strategies for maintaining the script’s long-term health.
1. Advanced Token Lifecycle (Refresh vs. Re-login)
- Refresh Tokens: If you ever capture a refresh_token in the /authorize response, it means the API supports a more efficient re-authentication flow. Look for an endpoint (often /token) that accepts grant_type=refresh_token. Wiring this into _with_refresh() is preferable to a full re-login, as it’s faster and less resource-intensive.
- Granular Error Logging: When a 401 occurs, log the server’s JSON error body (masking secrets). Some APIs return specific error codes like token_expired vs. invalid_token, which can help you decide whether to refresh or perform a full re-login.
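The refresh-versus-re-login decision could be expressed as a small pure function. This is only a sketch: the "error" key in the 401 body and the function name choose_reauth_action are assumptions, and token_expired is the example code mentioned above rather than a confirmed Seiue value.

```python
def choose_reauth_action(error_body: dict, has_refresh_token: bool) -> str:
    """Return "refresh" when a cheap token refresh is plausible, else "relogin"."""
    # Assumed location of the machine-readable code; adjust to the real body.
    code = str(error_body.get("error", "")).lower()
    if code == "token_expired" and has_refresh_token:
        return "refresh"
    # Unknown or invalid-token errors fall back to the full login flow.
    return "relogin"
```

Defaulting to a full re-login keeps the script working even if the error codes are renamed.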
2. Headers, Cookies, and Casing
- Casing: While HTTP headers are case-insensitive by spec, some server middleware is not. Always mirror the browser’s exact casing (e.g., X-School-Id) to avoid unexpected 403 errors.
- New Headers: If your script starts failing with 403 while the browser succeeds, look for new headers on the browser’s request, especially for POST/PUT calls. A new anti-CSRF header (like X-CSRF-Token) is a common culprit.
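One way to spot both problems is to compare the header names from a captured browser request against the ones the script sends. The sketch below assumes both sets are available as plain dicts; diff_headers is a hypothetical helper, not part of the existing script.

```python
def diff_headers(browser_headers: dict, script_headers: dict) -> dict:
    """Report headers the script lacks and names whose casing differs."""
    browser = {name.lower(): name for name in browser_headers}
    script = {name.lower(): name for name in script_headers}
    missing = sorted(browser[k] for k in browser.keys() - script.keys())
    casing = sorted(
        (script[k], browser[k])
        for k in browser.keys() & script.keys()
        if browser[k] != script[k]
    )
    return {"missing": missing, "casing": casing}


# Made-up example: the browser sends an extra CSRF header and a different casing.
report = diff_headers(
    {"Authorization": "Bearer x", "X-School-Id": "3", "X-CSRF-Token": "t"},
    {"Authorization": "Bearer x", "x-school-id": "3"},
)
```

Expect routine browser headers (User-Agent, Accept, and so on) to show up under "missing" as well; the interesting entries are the application-specific ones.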
3. Verification and Ground Truth
- Authoritative Signals: Treat 409/422 error responses as authoritative. Log the full JSON error body, as it often contains a machine-readable code (window_closed) or message that explains the exact business rule you violated.
- The Ground Truth: The only way to be 100% sure a submission succeeded is to re-verify. After a submission, always make a final call to GET /sams/attendance/attendances-info and confirm your submitted attendance_time_id now appears in the checked_attendance_time_ids list.
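The re-verification step reduces to a set comparison once the checked IDs have been fetched. The sketch below assumes you already have the checked_attendance_time_ids list in hand; unconfirmed_ids is a hypothetical helper name.

```python
def unconfirmed_ids(submitted_ids, checked_attendance_time_ids):
    """Return submitted IDs that the server does not yet report as checked."""
    checked = {int(i) for i in checked_attendance_time_ids}
    return [int(i) for i in submitted_ids if int(i) not in checked]


# 9001 is confirmed (returned as a string), 9002 is still missing.
pending = unconfirmed_ids([9001, 9002], ["9001"])
```

An empty result means every submitted attendance_time_id was confirmed; anything else deserves a log entry and possibly a retry.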
4. Payload Drift and Data Integrity
- Payload Drift: Be prepared for the submission payload to change over time. Common drifts include:
  - tag: "正常" changing to status: "present" or a numeric enum like status: 1.
  - Optional fields (source, reason) becoming required.
  - If you get a 422, the server’s response body is your best friend; it will often name the exact field that failed validation.
- Roster Quirks: Your code correctly casts IDs to int. Always assume API data might have inconsistent types (e.g., an ID that is sometimes a string and sometimes an integer). Your seen_records set is a solid defense against accidental duplicate records.
5. Network and Workflow Hygiene
- Pagination: You are currently using paginated=0. If this parameter is ever deprecated, the API will likely fall back to default pagination (e.g., 10-20 items per page). You’ll need to implement logic to loop through pages using page and per_page query parameters, checking the X-Pagination-* response headers until all data is fetched.
- Idempotency & Retries: If the API ever adds an Idempotency-Key header, start sending a unique UUID with each PUT/POST request to prevent accidental double submissions on network retries. Your current (attendance_time_id, owner_id) de-dupe serves a similar purpose. For 429 errors, if a Retry-After header is present, your script should ideally sleep for that specific duration.
- Timezone Hygiene: Continue generating start_time/end_time in Asia/Shanghai and format them exactly as captured. This prevents off-by-one-day bugs that can occur around midnight in different timezones.
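The timezone point can be made concrete with a small helper that derives the day window in Asia/Shanghai regardless of where the script runs. This is a sketch: day_window is a hypothetical name, and the "%Y-%m-%d %H:%M:%S" format is an assumption that must be replaced with the exact format seen in your capture.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

try:
    SHANGHAI = ZoneInfo("Asia/Shanghai")
except ZoneInfoNotFoundError:
    # Fallback for systems without tz data; Asia/Shanghai is UTC+8 with no DST.
    SHANGHAI = timezone(timedelta(hours=8), "Asia/Shanghai")


def day_window(now: datetime, fmt: str = "%Y-%m-%d %H:%M:%S"):
    """Return (start_time, end_time) strings for the Shanghai calendar day."""
    # `now` should be timezone-aware; a naive value is read as system local time.
    local_day = now.astimezone(SHANGHAI).date()
    start = datetime.combine(local_day, time.min, tzinfo=SHANGHAI)
    end = datetime.combine(local_day, time(23, 59, 59), tzinfo=SHANGHAI)
    return start.strftime(fmt), end.strftime(fmt)


# 17:30 UTC on 1 January is already 2 January in Shanghai.
start_time, end_time = day_window(datetime(2024, 1, 1, 17, 30, tzinfo=timezone.utc))
```

The example shows the midnight hazard: a machine running in UTC would otherwise request the previous day’s schedule.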
6. Proactive Recovery: Baseline Snapshots
- Create a “Golden Set”: After confirming the script works, use Proxyman to capture a clean sequence of the six key requests (login → authorize → events → attendances-info → roster → submit). Export this session as a HAR file or Proxyman session (.proxymanlog).
- Commit Privately: Store this “golden” HAR file in a private location (not your public git repo).
- Accelerate Debugging: When the script breaks in the future, perform a new capture and use a diff tool to compare the new HAR file against your golden baseline. The differences will show what changed in the API contract.
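A plain text diff of two HAR files is noisy (timestamps, tokens, cookies), so a structural comparison can help. The sketch below relies only on standard HAR fields (log.entries[].request with method, url, headers, queryString, postData.text) and compares names rather than values; summarize_har and diff_har are hypothetical helpers, and the example URL is a placeholder.

```python
import json
from urllib.parse import urlsplit


def summarize_har(har: dict) -> dict:
    """Map "METHOD /path" to the sets of query, header, and JSON body keys."""
    summary = {}
    for entry in har["log"]["entries"]:
        req = entry["request"]
        key = f'{req["method"]} {urlsplit(req["url"]).path}'
        body_keys = set()
        text = (req.get("postData") or {}).get("text")
        if text:
            try:
                parsed = json.loads(text)
                if isinstance(parsed, dict):
                    body_keys = set(parsed)
            except ValueError:
                pass  # non-JSON body (e.g. form data); ignored in this sketch
        summary[key] = {
            "query": {q["name"] for q in req.get("queryString", [])},
            "headers": {h["name"].lower() for h in req.get("headers", [])},
            "body": body_keys,
        }
    return summary


def diff_har(golden: dict, fresh: dict) -> dict:
    """Report added/removed names per request between two captures."""
    old, new = summarize_har(golden), summarize_har(fresh)
    report = {}
    for key in sorted(old.keys() | new.keys()):
        if key not in new:
            report[key] = "missing from new capture"
        elif key not in old:
            report[key] = "new request"
        else:
            changes = {}
            for part in ("query", "headers", "body"):
                added = sorted(new[key][part] - old[key][part])
                removed = sorted(old[key][part] - new[key][part])
                if added or removed:
                    changes[part] = {"added": added, "removed": removed}
            if changes:
                report[key] = changes
    return report


def _entry(headers, body):
    # Build a minimal HAR-shaped dict for the example below.
    return {"log": {"entries": [{"request": {
        "method": "PUT", "url": "https://example.test/submit?x=1",
        "headers": [{"name": n, "value": "v"} for n in headers],
        "queryString": [{"name": "x", "value": "1"}],
        "postData": {"text": json.dumps(body)},
    }}]}}


golden = _entry(["Authorization"], {"tag": "a"})
fresh = _entry(["Authorization", "X-CSRF-Token"], {"status": 1})
report = diff_har(golden, fresh)
```

With real files, load each capture with json.load() first. Paths that embed IDs will differ between captures, so they may need normalising before comparison.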
7. Security and Logging Best Practices
- Mask Secrets: Never log full access_token or refresh_token values. Log the first and last few characters if needed for identification (e.g., "Bearer sk_...w567").
- File Permissions: Ensure your apiall.log file permissions are not world-readable (e.g., chmod 600 apiall.log) to protect any sensitive data it might contain.
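Both practices are easy to apply from inside the script. The sketch below is one possible shape; mask_secret and restrict_log_permissions are hypothetical helper names, and the number of characters kept is an arbitrary choice.

```python
import os
import stat


def mask_secret(value: str, keep: int = 4) -> str:
    """Keep only the first and last `keep` characters of a secret."""
    if len(value) <= keep * 2:
        return "*" * len(value)  # too short to reveal any part safely
    return f"{value[:keep]}...{value[-keep:]}"


def restrict_log_permissions(path: str) -> None:
    """Equivalent of `chmod 600`: owner may read and write, nobody else."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


masked = mask_secret("abcdefghijklmnop")
```

Calling restrict_log_permissions("apiall.log") once at startup, after the log file exists, keeps the permissions correct even if the file is recreated.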