Simple web scraping with Bash: Ski Report
Linux Magazine|#262/September 2022
With one line of Bash code, Pete scrapes the web and builds a desktop notification app to get the daily snow report.
Pete Metcalfe

While recently doing a small project, I was amazed by how much web scraping I could do with just one line of Bash. I used the text-based Lynx browser [1] and then piped the output to a grep search. Figure 1 shows the one-line Bash example that scrapes the current snow depth from the Sunshine Village Snow Forecast web page.

In this article, I will introduce some techniques to easily scrape web pages, and then I will create a desktop notification script that provides the daily snow forecast.

The Lynx Text Browser

For my Bash web scraping, I started out with command-line tools such as curl [2] combined with the html2text [3] utility. This technique definitely works, but I found that the Lynx browser offers a one-step solution with slightly cleaner text output.
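For comparison, the two-step curl approach can be sketched roughly as follows. This is a minimal illustration, not the author's code: the function name page_text_curl and the example.com URL are placeholders, and it assumes an html2text command that reads HTML from standard input.

```shell
#!/usr/bin/env bash
# Two-step approach: curl fetches the raw HTML, html2text turns it into plain text.
# page_text_curl is a hypothetical helper name used only for this sketch.
page_text_curl() {
    local url="$1"
    # -s hides the progress meter, -L follows redirects
    curl -sL "$url" | html2text
}

# Example call (placeholder URL, not the page used in the article):
# page_text_curl "https://example.com/" | grep -i "snow"
```

The function only prints the converted text, so any search or filtering is added as a further pipe stage.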

To install Lynx on Raspbian/Debian/Ubuntu, use:

sudo apt install lynx

The Lynx -dump option writes a web page to standard output as plain text, with HTML tags, HTML encoding, and JavaScript removed. Figure 2 shows that a Lynx dump can greatly clean up the original web page and make searching considerably easier.
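As a rough sketch of the dump-and-search idea, the helper below pipes a Lynx dump into grep. The function name scrape_lines, the example.com URL, and the search text are all placeholders chosen for illustration; the exact command shown in the article's figures may differ.

```shell
#!/usr/bin/env bash
# scrape_lines URL PATTERN: dump a page as plain text and keep the matching lines.
# scrape_lines is a hypothetical helper name used only for this sketch.
scrape_lines() {
    local url="$1" pattern="$2"
    # -dump prints the rendered page to stdout;
    # -nolist leaves out the numbered list of link URLs at the end of the dump.
    # grep -i makes the search case-insensitive.
    lynx -dump -nolist "$url" | grep -i -- "$pattern"
}

# Example call (placeholder URL and search text):
# scrape_lines "https://example.com/forecast" "snow depth"
```

Without the function wrapper, the body is a single line of Bash: a dump piped to a search.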

Sometimes a simple Bash grep search might be all that you need. However, there are many cases where some text manipulation is required. The good news is that Bash has a nice selection of line and string manipulation tools.
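To give a feel for the kind of manipulation meant here, the sketch below pulls a number out of one line of text with cut, awk, and Bash parameter expansion. The sample line is made up for the example and only assumed to resemble what a text dump might contain.

```shell
#!/usr/bin/env bash
# Hypothetical input line, shaped like something a text dump might contain.
line="   Top Snow Depth: 150 cm"

# cut: keep the second colon-separated field (everything after the label)
value=$(printf '%s\n' "$line" | cut -d':' -f2)

# awk: print the first whitespace-separated field, which drops the unit
depth=$(printf '%s\n' "$value" | awk '{print $1}')

# Parameter expansion: remove everything up to and including ": "
# without calling an external tool
with_unit="${line#*: }"

echo "depth=$depth, with unit=$with_unit"
```

For a line like this, parameter expansion avoids starting extra processes, while cut and awk are easier to extend when the field position varies.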
