SUEN

K12 Textbook PDFs

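A machine freed up, so I'm planning to finish the AIK12 textbook project.
It starts with searching for a keyword and showing all related content from the K12 textbooks, after which the AI carries on the conversation in real time.
Step one was getting the textbooks, which took about three hours. Students have always needed electronic textbooks anyway, so I open-sourced it while I was at it: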


SmartEdu Textbooks Downloader — Operator’s Manual (v1.0)

Script: jks.sh
Last updated: 2025‑10‑23
Scope: Download K‑12 textbooks from the SmartEdu platform to your local machine or server, with resilient retries, resume, and a local HTML index page (no anti‑leech 403).


1) What this tool does

Important: The HTML index links to local files, not remote URLs, so the SmartEdu anti‑leech barriers (403 Forbidden) are not triggered.


2) Supported platforms

The script self‑checks and installs missing dependencies per distro. If a Python virtual environment (venv) cannot be created (e.g., minimal OS images), the script falls back to system Python transparently.


3) Quick start

3.1 Ubuntu/Debian (root or sudo)

bash
# Recommended: run as root on a clean VPS (it will auto-install python3/pip/venv if needed)
bash <(curl -Ls https://raw.githubusercontent.com/ieduer/bdfz/main/jks.sh)

3.2 macOS (Homebrew)

bash
# Ensure Homebrew is installed (https://brew.sh). Then:
bash <(curl -Ls https://raw.githubusercontent.com/ieduer/bdfz/main/jks.sh)

3.3 Non‑interactive “just run”

bash
# Example: High School, pick two subjects; skip interactive prompts
bash <(curl -Ls https://raw.githubusercontent.com/ieduer/bdfz/main/jks.sh) -p 高中 -s 语文,数学 -y

3.4 Force system Python (skip venv)

bash
# For minimal containers or constrained images
USE_SYSTEM_PY=1 bash <(curl -Ls https://raw.githubusercontent.com/ieduer/bdfz/main/jks.sh) -p 高中 -y

Tip: Interactive mode asks only two things (phase and subjects) and then starts immediately. Existing files are resumed without asking.


4) Interactive wizard (default)

When -y is not set and STDIN is a TTY, you’ll see:

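  1. Phase (enter a number 1–6):
    1) 小学 2) 初中 3) 高中 4) 特殊教育 5) 小学54 6) 初中54
    (primary, junior high, senior high, special education, primary in the 5-4 system, junior high in the 5-4 system)
  2. Subjects (comma‑separated, leave empty for defaults):
    语文,数学,英语,思想政治,历史,地理,物理,化学,生物
    (Chinese, mathematics, English, ideology and politics, history, geography, physics, chemistry, biology)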

After that the downloader starts—no confirmation prompt, no questions about retries or output path.
Existing files are validated and skipped or resumed automatically.
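
The manual does not spell out what "validated" means internally. A minimal sketch of one plausible check, assuming a file counts as complete when it starts with the %PDF- magic bytes and carries an %%EOF marker near its end (looks_like_pdf is a hypothetical helper, not a function from jks.sh):

```shell
#!/usr/bin/env bash
# Hypothetical helper: heuristic completeness check for a downloaded PDF.
# This is an assumption about what "valid" could mean, not the check jks.sh uses.
looks_like_pdf() {
  local f=$1
  # Must be a regular, non-empty file.
  [ -f "$f" ] && [ -s "$f" ] || return 1
  # A PDF begins with the magic bytes "%PDF-".
  [ "$(head -c 5 "$f" | tr -d '\0')" = "%PDF-" ] || return 1
  # A complete PDF carries an "%%EOF" marker close to the end of the file.
  tail -c 1024 "$f" | grep -aq '%%EOF'
}

# Usage: looks_like_pdf path/to/book.pdf && echo "already OK"
```

Under this heuristic a truncated download fails the %%EOF test, and an HTML error page saved under a .pdf name fails the header test.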


5) Command‑line options

text
-p PHASE       Phase (小学|初中|高中|特殊教育|小学54|初中54)
-s SUBJECTS    Comma-separated subjects (default: all standard subjects)
-m "WORDS"     Optional title keywords (space-separated) to narrow selection
-i IDS         Specific book IDs (comma-separated) to download
-o DIR         Output directory (default: ./smartedu_textbooks)
-R             Retry only the items recorded in failed.json
-c N           Max HTTP concurrency (integer). Leave unset for auto-tuning
-d N           Max disk write concurrency (integer). Leave unset for auto-tuning
-n N           Limit to first N books (debug/smoke test)
-T N           Post-round auto-retry rounds (default: 2; 0 disables)
-y             Non-interactive run (skip prompts; use provided options)
-h             Help
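
The flags above combine freely. A sketch of three common runs, wrapped in a hypothetical helper so the flag sets are easy to read and to preview with DRY_RUN=1. The jks wrapper, the function names, and DRY_RUN are illustrative and not part of jks.sh; whether -R also needs -p is an assumption, so the phase is passed explicitly:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of jks.sh): fetch the script and pass flags through.
JKS_URL="https://raw.githubusercontent.com/ieduer/bdfz/main/jks.sh"
OUT_DIR=${OUT_DIR:-./smartedu_textbooks}   # documented default for -o

jks() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "jks.sh $*"                       # preview the flags, download nothing
  else
    bash <(curl -Ls "$JKS_URL") "$@"
  fi
}

# Smoke test: only the first 3 senior-high Chinese books, no prompts.
smoke_test()   { jks -p 高中 -s 语文 -n 3 -y; }

# Full run for two subjects with 3 automatic retry rounds.
full_run()     { jks -p 高中 -s 语文,数学 -o "$OUT_DIR" -T 3 -y; }

# Later: retry only the items recorded in failed.json.
retry_failed() { jks -p 高中 -o "$OUT_DIR" -R -y; }

# Preview without touching the network:
#   DRY_RUN=1 smoke_test
```

Running smoke_test before full_run is a cheap way to confirm dependencies and write access before committing to a long download.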

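Environment variables: USE_SYSTEM_PY=1 forces system Python and skips the venv (see 3.4); DEBUG=1 is for troubleshooting reruns (see 14).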


6) Output layout

text
smartedu_textbooks/
  ├─ <Subject>/
  │   ├─ <Title>.pdf
  │   ├─ <Title>__<id8>.pdf           # when same-title collision occurs
  │   └─ ...                          # partial/in-progress: <file>.part
  ├─ index.json                       # success list
  ├─ failed.json                      # failures (for -R), may be empty
  ├─ index.html                       # local, filterable index page
  └─ smartedu_download.log            # detailed run log
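
To check a run from the shell, the layout above is enough to count finished PDFs per subject and to spot leftover partials. A small sketch that relies only on the directory structure shown (count_pdfs and list_partials are hypothetical helper names, and the *.part match assumes partial files end in that suffix):

```shell
#!/usr/bin/env bash
# Hypothetical helpers that inspect the output directory; not part of jks.sh.

# Print "<Subject><TAB><number of PDFs>" for each subject folder.
count_pdfs() {
  local root=${1:-smartedu_textbooks}
  local dir n
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    n=$(find "$dir" -maxdepth 1 -type f -name '*.pdf' | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$(basename "$dir")" "$n"
  done
}

# List partial downloads left behind by an interrupted run.
list_partials() {
  local root=${1:-smartedu_textbooks}
  find "$root" -type f -name '*.part'
}

# Usage:
#   count_pdfs smartedu_textbooks
#   list_partials smartedu_textbooks
```

An empty list_partials result after a run suggests nothing was left half-written; anything it prints should be picked up on the next rerun.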

7) HTML index

7.1 Local preview

bash
cd smartedu_textbooks
python3 -m http.server 8000
# Open: http://localhost:8000/index.html

7.2 Share on your server (optional, production‑friendly)

bash
# One-time setup
sudo mkdir -p /srv/smartedu_textbooks
sudo rsync -a --delete smartedu_textbooks/ /srv/smartedu_textbooks/
sudo chmod -R o+rX /srv/smartedu_textbooks

# Nginx site
sudo tee /etc/nginx/sites-available/smartedu >/dev/null <<'EOF'
server {
    listen 80;
    server_name _;
    root /srv/smartedu_textbooks;
    index index.html;
    autoindex on;
    autoindex_exact_size off;
    autoindex_localtime on;
    location / { try_files $uri $uri/ =404; }
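    # PDFs are already mapped to application/pdf by the default mime.types;
    # a types {} block here would replace that whole map and break index.html.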
    add_header X-Content-Type-Options nosniff;
}
EOF
sudo ln -sf /etc/nginx/sites-available/smartedu /etc/nginx/sites-enabled/smartedu
sudo nginx -t && sudo systemctl reload nginx

# Update after each run
sudo rsync -a --delete smartedu_textbooks/ /srv/smartedu_textbooks/

If you need access control, add auth_basic + htpasswd or put the site behind your VPN.


8) How selection & downloads work

  1. Catalog fetch → filters by phase & subjects (and optional keywords or IDs).
  2. Direct link resolution per book.
  3. Target path planning
    • Prefer a previously recorded path from index.json for the same book_id.
    • If the planned <Title>.pdf already exists (or collides), append __<id8> and retry with numeric suffixes as needed.
  4. Validation & planning
    • If a target already exists and looks like a valid PDF, it is counted as “already OK” (no task created).
    • Incomplete/corrupted or missing files are scheduled for download.
  5. Download (async, resumable) → written to <file>.part, then finalized as the .pdf.
  6. Post‑round retries: items that failed in a round are retried automatically, for up to -T extra rounds (default 2).
  7. Writes index.json, failed.json, and regenerates index.html.
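
Step 3 can be sketched in a few lines. This is an illustration under stated assumptions, not the code in jks.sh: plan_target is a hypothetical name, <id8> is assumed to be the first 8 characters of the book ID, the numeric suffix format (_2, _3, ...) is invented for the example, and the index.json lookup is left out, so any existing file is treated as belonging to a different book.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of target path planning; names and suffix format are assumptions.
plan_target() {
  local dir=$1 title=$2 book_id=$3
  local id8=${book_id:0:8}               # assumption: <id8> = first 8 chars of the ID
  local candidate="$dir/$title.pdf"
  local n=2

  # First choice: the plain title.
  if [ ! -e "$candidate" ]; then
    printf '%s\n' "$candidate"
    return 0
  fi

  # Collision: append __<id8>, then numeric suffixes until the name is free.
  candidate="$dir/${title}__${id8}.pdf"
  while [ -e "$candidate" ]; do
    candidate="$dir/${title}__${id8}_${n}.pdf"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}

# Usage: plan_target smartedu_textbooks/数学 "Some Title" "0123456789abcdef"
```

Keeping the ID fragment in the file name means two editions with the same title can sit side by side in one subject folder without overwriting each other.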

9) Typical workflows


10) Notes on performance & reliability


11) Safety & compliance


12) Changelog (excerpt)


13) FAQ

Q: Can I run this as a normal user?
Yes. The script attempts a local venv. If creation fails and you don’t have sudo, use USE_SYSTEM_PY=1 and ensure you can write to the output directory.

Q: Can I pause and resume?
Yes—interrupt at any time. Rerun later; the downloader will validate, resume partials, and skip completed items.

Q: How do I filter to a single book?
Use -m "keyword1 keyword2" to match title words, or -i bookId1,bookId2 for exact IDs.

Q: Why does the browser download fail if I click a remote link?
Because SmartEdu checks the Referer header on remote links and rejects other requests with 403 Forbidden. Use the generated index.html (local links), or serve your local copies via a web server you control.


14) Support

If you hit a cryptic error, rerun with DEBUG=1 and share the tail of smartedu_textbooks/smartedu_download.log and the console snippet around the failure.