Section 8
Safety and WebFetch
This session gives Claude a practical internet toolkit: fetch pages, scrape data, save visual proof, download authorized media, and use browser automation carefully. The goal is power with guardrails.
Workshop Recording
Follow along with the live session. Hit play and the video will stick to the top as you scroll.
Claude Dangerously Skip Permissions
Run Claude in the open terminal
In your open terminal, start Claude in the workshop project folder before you install WebFetch.
claude --dangerously-skip-permissions
Install and Register WebFetch
Install and register skills
Paste this prompt into the Claude session you already opened. Claude will download the public GitHub repo, install the dependencies, verify the tool, and add the skill notes to your skills.md or SKILLS.md file.
Use npm for the persistent install. Use npx only for one-time setup commands like installing Playwright's Chromium browser.
Take the Internet Apart
WebFetch is an agent built by Joe Che that wraps some of the most useful web-fetch tools into a single workflow, so you can easily pull things from the internet into Claude in a clean, usable form.
Think of it as a workshop-grade internet toolkit. It helps Claude choose the right method for the job: quick page reading when the site is simple, a real browser when the site is dynamic, media downloading when you are allowed to save a video, and explicit safety checks when cookies, private pages, or browser control are involved.
This includes:
Scraping Data
Pull page text, tables, links, prices, headlines, and repeated elements into structured output.
Grabbing Videos
Download public or authorized video and audio for transcription, clipping, analysis, and archiving.
Moving Your Mouse and Clicking
Have your computer move your mouse, click things for you, scroll pages, and reveal dynamic content when a real browser is required.
One WebFetch Package
A compilation of multiple skills in one WebFetch workflow, so Claude can route the task instead of making you choose the tool by hand.
Using WebFetch Examples
Fetch a competitor's pricing page
WebFetch is instant market research. Give Claude a URL and it will read the page and analyze it for you:
Research process before a sales call
Before a call, have Claude pull together everything publicly available about the person or company:
Download an Instagram video
Use a public Instagram post as the class demo. Claude should still confirm authorization before using browser cookies or touching private media.
Tools Inside WebFetch
These are the practical commands Claude can use after the tool is installed and registered in your skills file. Each one is meant for a different kind of web task.
Fetch: Page Reader
Reads a public webpage and gives Claude the useful content back as Markdown, text, JSON, or raw HTML. Use it for articles, pricing pages, landing pages, help docs, product pages, and competitor research.
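At its core, a simple fetch is one HTTP GET that returns the page body as text. A stdlib sketch of that step, not WebFetch's actual code (the URL is a placeholder):

```python
# Minimal page fetch: one HTTP GET, body decoded to text.
import urllib.request

def fetch_text(url: str) -> str:
    """Return the raw body of a public page as text."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")

if __name__ == "__main__":
    try:
        # Placeholder URL -- swap in the page you are researching.
        print(fetch_text("https://example.com")[:300])
    except OSError as exc:  # no network on this machine
        print(f"fetch skipped: {exc}")
```

WebFetch layers Markdown conversion and content cleanup on top of this raw step; the sketch only shows the retrieval itself.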
Extract: Precision Puller
Pulls a precise piece of a page with a CSS selector. Use it when you want every link, every headline, every price, all buttons, a table, or one repeated element instead of a full-page summary.
Screenshot: Visual Proof
Captures a page visually. Use it for design QA, before-and-after checks, proof that a page rendered correctly, or when Claude needs to inspect layout rather than just text.
PDF: Permanent Record
Saves a webpage as a PDF. Use it for client research archives, receipts, policy pages, references, or anything you want to preserve exactly as it appeared at the time.
Media: Video and Audio Capture
Downloads public or authorized video/audio through yt-dlp for offline analysis, transcription, clipping, or reference. This is the command students will use for demos like public Reels, YouTube videos, podcast clips, or approved client media.
Batch: Many URLs at Once
Runs the same fetch workflow across a list of URLs. Use it for competitor lists, source lists, lead research, content monitoring, or pulling many product pages into one research pass.
Cache: Faster Iteration
Reuses recent fetches so Claude does not keep hitting the same page while you iterate. Clear the cache when a page changed or when you need a fresh read.
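The idea behind the cache fits in a few lines: keep recent results keyed by URL and only refetch after a time-to-live expires. A minimal stdlib illustration of the concept, not WebFetch's actual cache:

```python
import time

class FetchCache:
    """Reuse recent fetches; refetch only after ttl seconds have passed."""

    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, url: str, fetch):
        """Return the cached body for url, calling fetch(url) only on a miss."""
        hit = self._store.get(url)
        if hit and time.monotonic() - hit[0] < self.ttl:
            return hit[1]  # still fresh -- skip the network
        body = fetch(url)
        self._store[url] = (time.monotonic(), body)
        return body

    def clear(self):
        """Drop everything -- use when the page changed and you need a fresh read."""
        self._store.clear()
```

With this shape, repeated `get` calls for the same URL inside the TTL window never touch the network, and `clear()` is the "fresh read" escape hatch.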
Preflight + Status: Readiness Check
Checks whether the local install is healthy: Node, Playwright, optional Python tools, shot-scraper, and yt-dlp. Status shows what the agent is doing or what failed.
The engines underneath
WebFetch is a router. It looks at the job and chooses the best local engine instead of making you remember which tool fits which situation.
yt-dlp: The Media Retriever
yt-dlp is the media engine behind webfetch media. It supports thousands of extractor targets, including many common video, social, audio, education, news, and livestream sites. Examples commonly supported by yt-dlp include YouTube, Vimeo, TikTok, Instagram, X/Twitter, Reddit, Twitch, SoundCloud, Facebook, and many podcast or news video pages.
What it can download: public media, direct video/audio URLs, many embedded players, subtitles when available, audio-only versions, and authenticated media only when you explicitly provide browser cookies and are allowed to access that content.
What it cannot reliably download: DRM-protected streaming services, private posts you cannot access, paywalled content you are not authorized to use, expired or geo-blocked videos, and sites that changed their player after yt-dlp last updated. Even listed sites can break, so the honest test is to try the URL and keep yt-dlp updated.
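As a rough sketch, these are the kinds of yt-dlp invocations that sit behind the media command. The flags are real yt-dlp options; the URL is a placeholder, and you should only run them against media you are authorized to download:

```shell
# Placeholder URL -- replace with a public or authorized link.
URL="https://example.com/watch?v=abc123"

if command -v yt-dlp >/dev/null 2>&1; then
  # Audio only, converted to mp3 -- a convenient shape for Whisper transcription.
  yt-dlp -x --audio-format mp3 -o "%(title)s.%(ext)s" "$URL" || echo "audio download failed"
  # Subtitles only (no video), when the site provides them.
  yt-dlp --write-subs --sub-langs en --skip-download "$URL" || echo "subtitle fetch failed"
else
  echo "yt-dlp is not installed on this machine"
fi
```

Keeping yt-dlp updated matters here: sites change their players often, and an update is frequently the fix for a download that suddenly breaks.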
Playwright: The Real Browser Scraper
Playwright opens Chromium and lets the page run JavaScript before Claude reads it. Use this when a site loads content after the page opens, hides data behind tabs, needs scrolling, or has a modern app interface. It is heavier than a simple HTTP fetch, but it sees the page closer to how a real visitor sees it.
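The pattern can be sketched with Playwright's Python API (the URL is a placeholder, and Playwright plus its Chromium build must already be installed, e.g. via npx playwright install chromium):

```python
def fetch_rendered_text(url: str) -> str:
    """Open a real Chromium, let the page's JavaScript run, return the visible text."""
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")  # wait for dynamic content to settle
        text = page.inner_text("body")
        browser.close()
    return text

if __name__ == "__main__":
    try:
        print(fetch_rendered_text("https://example.com")[:500])  # placeholder URL
    except Exception as exc:  # Playwright or Chromium missing on this machine
        print(f"skipped: {exc}")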
Beautiful Soup: The HTML Surgeon
Beautiful Soup is a lightweight HTML parser. It does not act like a browser and it does not run the page. Use it after a page has already been fetched when you want to walk the HTML cleanly: find all links, pull headings, remove navigation, extract table rows, or isolate repeated elements.
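A small example of that surgical style, using Beautiful Soup's real API on markup that has already been fetched (the HTML is inline so the sketch is self-contained):

```python
from bs4 import BeautifulSoup

# Stand-in for a page body that has already been fetched.
html = """
<html><body>
  <nav><a href="/home">Home</a></nav>
  <h1>Pricing</h1>
  <table>
    <tr><td>Basic</td><td>$10</td></tr>
    <tr><td>Pro</td><td>$25</td></tr>
  </table>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
links = [a["href"] for a in soup.find_all("a")]   # every link on the page
soup.nav.decompose()                              # strip the navigation block
heading = soup.h1.get_text()
rows = [[td.get_text() for td in tr.find_all("td")]
        for tr in soup.find_all("tr")]            # table rows as lists

print(heading, links, rows)
```

Nothing here opens a browser or runs JavaScript; the parser just walks HTML that some other tool already retrieved.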
Scrapling: The Fast Static Scraper
Scrapling is for pages that do not need a browser. It is usually faster than Playwright because it fetches and parses the page directly. Use it for blogs, docs, public articles, simple landing pages, and other pages where the useful content is already present in the HTML.
shot-scraper: The Screenshot Specialist
shot-scraper is a screenshot specialist. Use it when you need repeatable screenshots, multiple viewport sizes, selector screenshots, JavaScript setup before capture, or a batch of screenshots from a YAML file.
browser-use: The Autonomous Browser Agent
browser-use is for multi-step web tasks where Claude needs to browse, decide, click through pages, compare results, or keep going until it finds something. It is more agentic than normal fetch, so use it for research workflows rather than simple extraction.
OpenCLI Adapters: Known-Site Shortcuts
OpenCLI adapters are meant to be deterministic shortcuts for sites with known structures. When an adapter exists, the tool can use that adapter instead of spending tokens or browser time figuring out the site. This is optional and currently sits at the bottom of the stack because the core workshop value comes from fetch, extraction, screenshots, PDFs, and yt-dlp media workflows.
WebFetch Real-World Use Cases
01
Daily reporting
Have Claude log into your analytics dashboard every morning, take a screenshot, and summarize the numbers in a WhatsApp message.
02
Lead research
Paste a list of company names. Claude fetches each website and builds a one-line summary for each prospect.
03
Form automation
You describe what you want to create. Claude opens your project management tool and fills in the form.
04
QA testing
Ship a new page on your site. Claude clicks through it as a real visitor and flags anything broken or confusing.
05
Content monitoring
Fetch your industry news sources every morning and get a 3-bullet briefing on what is worth knowing.
06
Pricing intelligence
Fetch the pricing pages of 5 competitors and get a side-by-side breakdown with recommendations for your own pricing.
Install Local Whisper and Diarization
Install Whisper, model downloads, and diarization
Use one Claude prompt to set up local transcription for your operating system. On Apple Silicon Macs, Claude should use MLX Whisper. On Windows, Linux, or Intel Macs, Claude should use faster-whisper. The setup installs two diarization options so you can choose what fits your recording.
Diarization means speaker labels — Whisper transcribes the words, diarization identifies who said them. You have two options:
- simple-diarizer — fully local, no account needed, installs in one command. Best for short or medium recordings with a small number of speakers.
- pyannote.audio — more powerful for long recordings with many speakers. Requires a free Hugging Face account and accepting the model terms once. Sign up at huggingface.co — it is free.
Transcribe the downloaded video
Point your local Whisper install at the media file you downloaded with WebFetch. Use the small model for a live demo, then switch to the larger model when quality matters. The exact command depends on whether your installer chose MLX Whisper or faster-whisper.
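If the installer chose faster-whisper, the transcription step looks roughly like this (faster-whisper's real Python API; the file path is a placeholder for the media you downloaded with WebFetch):

```python
def transcribe(path: str, model_size: str = "small") -> list[str]:
    """Transcribe a local media file; switch to "large-v3" when quality matters."""
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size)
    segments, _info = model.transcribe(path)
    return [f"[{seg.start:6.1f}s] {seg.text.strip()}" for seg in segments]

if __name__ == "__main__":
    try:
        print("\n".join(transcribe("downloaded_video.mp4")))  # placeholder path
    except Exception as exc:  # faster-whisper or the media file missing here
        print(f"transcription skipped: {exc}")
```

On Apple Silicon the installer uses MLX Whisper instead, so the import and model names differ, but the shape of the step is the same: load a model, feed it the file, print timed segments.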
Installing GuardDog
GuardDog is a command-line safety tool that checks packages before you install or trust them. It is designed to spot risky patterns like credential harvesters, obfuscated scripts, suspicious install hooks, crypto miners, reverse shells, and other common supply-chain attacks.
This install is intentionally simple: one npm install command, then one setup command. It works as a one-click-style install on any platform with Node and npm, including macOS, Windows, and Linux. Students still get the commands in their own copy blocks so Claude Code users and Codex users can paste the safest version for their tool.
Install GuardDog
Use the Copy Claude Code button for Claude Code. Use the Copy Codex Only button for Codex. Both versions should install GuardDog, run setup, and confirm the command is ready before you use it to inspect packages.
Get Your VirusTotal API Key
VirusTotal is a free online scanner that checks files, URLs, domains, and IP addresses against over 70 antivirus engines and security tools at once. Where GuardDog checks package code for suspicious patterns before you install, VirusTotal lets you scan the actual file or URL against the world's largest collective threat database.
The free API tier gives you 500 requests per day with a rate limit of 4 requests per minute. That is more than enough for personal use, client work, and workshop exercises. You only need a free account — no credit card required.
Create a free VirusTotal account and get your API key
Paste this into Claude to walk through the account setup and get your key saved somewhere useful.
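Once the key is saved, a lookup is a single authenticated GET. A stdlib sketch against VirusTotal's v3 API (the endpoint and x-apikey header come from VT's public API docs; VT_API_KEY is an environment variable you set yourself):

```python
import os
import urllib.request

API_BASE = "https://www.virustotal.com/api/v3"

def vt_request(path: str) -> urllib.request.Request:
    """Build an authenticated GET for the VirusTotal v3 API."""
    key = os.environ.get("VT_API_KEY", "")
    return urllib.request.Request(f"{API_BASE}/{path}", headers={"x-apikey": key})

if __name__ == "__main__":
    req = vt_request("domains/example.com")  # domain report lookup
    print(req.full_url)
    # Uncomment once VT_API_KEY is set (free tier: 500 requests/day, 4/minute):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
```

Stay under the 4-requests-per-minute limit when scanning lists; a one-line `time.sleep(16)` between calls keeps a batch safely inside the free tier.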
You have now unlocked the system.
We have our first line of defense to protect us.
We have a way to bring any information from the internet straight into your hands.
Using Codex instead of Claude Code? Codex version of this page